We have a new #1 open-source AI image generator! (RIP Flux)

Estimated read time: 1:20

Summary

Hydream, an open-source AI image generator created by Vivago, has taken the top spot among open-source text-to-image models on the independent Artificial Analysis leaderboard, overtaking Flux Dev. The model is uncensored out of the box and produces detailed, anatomically accurate images. The video tests it on a series of tricky prompts against Flux Dev, Stable Diffusion 3.5, and Stable Diffusion XL; Hydream wins most of the comparisons, though SDXL does better on anime, impressionist painting, and uncommon animals. The video also explains how to install and run Hydream locally.

Highlights

Hydream tops the leaderboard as the leading open-source image generator, surpassing Flux. 🏆
Offers superior quality without censorship, allowing for creative freedom in generation. 🎨
Hydream outperforms other models in rigorous prompt testing, displaying impressive accuracy. 📸
Local installation guides help users run Hydream effectively on their own devices, maximizing usage. 🖥️
Although Hydream is currently number one, further model fine-tuning could yield even better results. 🔧

Key Takeaways

Hydream by Vivago takes the lead as the top open-source AI image generator, beating out Flux and others. 🚀
Completely uncensored, Hydream excels at generating high-quality and anatomically accurate images. 😎
The video compares Hydream with other open-source models, showcasing its superior handling of complex prompts. 🤖
Full installation and usage guide provided, allowing users to run Hydream locally with ease. 💻
Flux might have lost its crown, but there's potential in fine-tunings and checkpoints. 🤔

Overview

The video begins by spotlighting a major shift in the AI image generation space: Hydream, an open-source model by Vivago, has claimed the top spot on the leaderboard. Known for its uncensored output and superior quality, Hydream's emergence marks a significant evolution, with comparisons showing its dominance over other renowned models like Flux and Stable Diffusion.

Throughout the presentation, the creator of the video conducts various tests comparing Hydream to other top models under tricky and comprehensive prompts. The results consistently show Hydream's capability to produce accurate, detailed, and vivid images, particularly excelling in scenarios demanding anatomical precision and realistic representations.

The latter part of the video transitions into a detailed tutorial on how to install and run Hydream locally, providing viewers with the tools and knowledge to access this powerful AI model on personal systems. This empowers users to leverage Hydream's full potential, making this shift towards more accessible AI technology exceedingly relevant and exciting.

Chapters

00:00 - 00:30: Introduction to Hydream The chapter introduces Hydream, a new leading open-source image generator developed by Vivago. According to an independent leaderboard by Artificial Analysis, Hydream is ranked as the top open-source text-to-image model. The next best open-source model, Flux 1 Dev, sits significantly lower in the ranking. The chapter also notes that Flux Pro is a closed-source model.
00:30 - 03:00: Hydream vs Other Open-Source Models In this chapter titled 'Hydream vs Other Open-Source Models,' the speaker points out that Flux Pro is closed source, while Hydream is open source, beats Flux, and is completely uncensored from the start, allowing it to generate a wide range of content based on user prompts. The speaker tests it against challenging prompts, starting with a ballerina frozen mid-leap, and compares the outputs with Flux Dev, Stable Diffusion 3.5, and Stable Diffusion XL. Viewers are also promised a guide on how to install and run the model locally for unlimited use.
03:00 - 19:30: Testing with Various Prompts This chapter runs Hydream, Flux Dev, Stable Diffusion 3.5, and Stable Diffusion XL through a series of tricky prompts: an isometric 3D bedroom scene, a school yearbook page, a GTA 6 cover for PS5, a tiger with butterfly wings playing chess against a ghost, a long handwritten sentence, a low-quality amateur selfie, a consulting firm website UI, real people having dinner, two hands making a heart symbol, and an anime scene. Hydream wins most of these comparisons, Flux Dev wins the amateur selfie, and SDXL wins the anime prompt.
19:30 - 20:30: Ad Testing and Humva Sponsorship In this chapter titled 'Ad Testing and Humva Sponsorship', the speaker presents the video's sponsor, Humva, an AI-powered avatar platform for creating spokesperson videos without a camera, editing skills, or a real presenter. Users choose from hundreds of presenters or upload a single photo to create their own digital avatar, then pick from a selection of voices in multiple languages. Humva is offering a first-month free trial for new users to mark its official launch.
20:30 - 21:30: Testing Different Art Styles and Animals The chapter tests two harder categories: a Manet-style impressionist painting of a deer in a forest, and a pair of spectral tarsiers on a tree. None of the four models (Hydream, Flux Dev, Stable Diffusion 3.5, and Stable Diffusion XL) fully captures the impressionist style or a realistic tarsier; the speaker picks SDXL as the best result in both tests.
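21:30 - 30:00: Installation and Setup Instructions This chapter covers where and how to use Hydream. The official GitHub repo offers three models (full, dev, and fast), and a free but censored Hugging Face space runs the dev model. Running locally requires a CUDA GPU; the official models need a very large amount of VRAM, while 4-bit quantized versions run in roughly 15 to 16 GB. The speaker then walks through installing the Hydream Sampler for ComfyUI on Windows: checking the Python, CUDA, and Torch versions, installing a matching pre-built Flash Attention wheel, installing accelerate, and adding the custom node through the ComfyUI manager.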
30:00 - 35:00: Conclusion and Call to Action The chapter details a test of a complex isometric 3D scene creation prompt involving various elements: a man working at a desk, an empty bookshelf, a pet cat on a bed, and more. The scene setup is described to emphasize the intricacy and detail of the visual layout.

We have a new #1 open-source AI image generator! (RIP Flux) Transcription

00:00 - 00:30 this is crazy news we have a new top open-source image generator it's called Hydream by Vivago and don't take my word for it if you look at this independent leaderboard by Artificial Analysis which ranks all the text-to-image models you can see that Hydream is currently the number one open-source model out there note that the next best open-source model is all the way down here Flux 1 Dev also note that Flux Pro is not open-source
00:30 - 01:00 this is a closed-source model it's pretty insane how good this is yes this even beats Flux plus it's completely uncensored right out of the gate so you can prompt it to generate some pretty wild things plus the models are already out so in this video I'm going to test it on a series of really tricky prompts and compare this with the other top open-source image models out there so you can see how good this is plus I'll show you how to install this and run it locally for unlimited times on your computer now before we go into
01:00 - 01:30 where to use it and how to install this let's first look at some demos here's what the interface looks like and don't worry in the latter half of the video I'm going to show you step by step how to install this and run it locally on your computer but for now let's just go over some demos to show you what this can and cannot do so here's the first prompt a ballerina frozen mid leap during a performance her left leg is extended forward arms forming a soft arc above her head spotlight on her let's
01:30 - 02:00 click run and see what that gives us and note that I'm using the full model for all my generations and here's what we get now I'm going to run this a few more times and take the best of three generations all right so here's the best out of three generations in terms of the hands and fingers it looks like this one is the most accurate so I'm going to go with this one but note that for all three generations it was able to get the anatomy of the woman very accurately so this isn't like a nerfed model like stable diffusion 3 where it can't even
02:00 - 02:30 generate humans properly because of its censorship this is completely uncensored and then here's the same prompt note that I'm comparing this with the other top open-source models out there Flux Dev Stable Diffusion 3.5 and Stable Diffusion XL which is still pretty good to be honest notice that I did not compare this with Flux Pro because that is closed source so here I'm only comparing Hydream with open-source models so that it's an apples-to-apples comparison note that for the prompt I
02:30 - 03:00 did specify for the ballerina to be frozen mid leap during a performance so out of all these four images Hydream was the only one that was able to generate her in mid leap plus anatomically everything looks pretty accurate to me note that for all the other models it couldn't really generate the fingers very accurately including Flux and SD3.5 and SDXL plus notice for SDXL the face is kind of messed up so I would say out of these four generations Hydream produced the best image here all right
03:00 - 03:30 next up I tested this prompt an isometric 3D scene of a bedroom there is a man sitting on a red chair at a wooden desk working on a laptop there's a white empty bookshelf a pet cat is curled up on a gray bed with white pillows beside the bed is a nightstand with a lamp and alarm clock the wall is teal there is a window with white curtains there are some house plants and an acoustic guitar hanging on the wall and this prompt is purposely very complicated with a lot of
03:30 - 04:00 different objects in different colors and positions and I'm really trying to see if the image model can understand and generate all of this at once so let's look at Hydream's generation first here we do see a man sitting on a red chair working at a wooden table on his laptop there is a bookshelf but it's not empty there is a cat lying on a gray bed with white pillows there is a nightstand with a lamp and alarm clock plus there is a guitar hanging on the wall and there's a window with white curtains so this looks pretty good to me
04:00 - 04:30 if you look at Flux Dev there is a man sitting on a red chair I'm not sure if the desk is wooden the bookshelf is not empty plus the nightstand is missing an alarm clock and note that this is the best out of three generations for SD3.5 the guitars look messed up there is no alarm clock on this nightstand plus the bed is messed up the cat is messed up this just has a lot of errors and then for SDXL this isn't really an isometric
04:30 - 05:00 3D scene plus it's missing the bed but it's missing a guitar there are just a lot of errors with this so in terms of following the prompt especially if it's really complicated with a lot of details again I would have to give the point to Hydream it's really good at prompt adherence all right next the prompt is a realistic school yearbook photo page with student photos each student has a unique outfit and expression so here's Hydream's example it even put school year at the top of the page and the text
05:00 - 05:30 is completely correct we do have a nice-looking grid of realistic-looking yearbook photos the only flaw I can spot is that there are some repeats of the same faces so this kid and this kid and this one and this one kind of look the same same with these two girls but other than that I mean these do look like realistic yearbook photos now if we look at Flux Dev this does kind of look like a yearbook photo although all these students do have kind of a plastic face
05:30 - 06:00 which is common among Flux generations I would say these don't really look like yearbook photos especially if you compare it with this generation by Hydream however I do like that this does look like a page in a yearbook plus it even has some made-up text at the bottom of each photo for the names of the students and then here's Stable Diffusion 3.5 it's just not great I mean the quality of these faces is really blurry plus this grid isn't even
06:00 - 06:30 aligned properly like some photos are smaller some photos are larger plus some rows have names some rows don't have names it's just not a very consistent generation and then SDXL the quality of these portrait photos is actually not bad especially if you zoom out this does look quite realistic but as soon as you zoom in notice that there are a lot of deformations and weird things going on with the faces plus this doesn't really look like a page in a yearbook this looks more like a poster of a graduation class I would say so out of
06:30 - 07:00 all these examples although Hydream isn't great there are repetitions of some of these faces but overall this does look like the best grid of yearbook photos let me know in the comments what you think since GTA 6 is never going to happen let's just create it with AI so here the prompt is create the cover of the video game GTA 6 for PS5 the design should be for a standard PS5 video game case so here is Hydream's example it has
07:00 - 07:30 a very beautiful and standard PS5 design here with the Grand Theft Auto logo all the text looks correct the only flaw I can really point out is that this age rating does not look correct plus this logo over here but the design overall does have that GTA vibe to it and then if you look at Flux Dev this also has a perfect PS5 logo plus GTA 6 plus the text here looks correct we have some strange watermark here for some reason
07:30 - 08:00 and then these two symbols and logos do not look correct in terms of the design of the game I would say it doesn't really have as much of a GTA vibe to it as this one and then for both SD3.5 and SDXL it just couldn't really generate a PS5 case even though I've tested this prompt three times for each model so again out of the four generations if I had to pick a winner I would go with Hydream but let me know in the comments what you think all right here again I'm
08:00 - 08:30 testing it on a really tricky prompt with things that don't really go together in real life so here the prompt is a tiger with butterfly wings playing chess against a translucent ghost so here's Hydream's generation this is the best of three generations and there is a tiger with butterfly wings this is a translucent ghost plus they are playing chess the only minor flaw I can spot is that the chess board does not look accurate I think there should be more
08:30 - 09:00 squares on this grid but at least it does have pieces in both colors and then here is Flux Dev's generation it chose to generate this in like 2.5D which is okay I didn't specify for this to be a realistic photo so this is a tiger with butterfly wings there is a cute ghost and they are playing chess but in terms of the accuracy and realism of this chess board and the chess pieces notice that Hydream's generation is more accurate so here we only have pieces of
09:00 - 09:30 one color and then for SD3.5 it's just a total nightmare it could not handle this prompt even after prompting it three times and then for SDXL it can generate a tiger without butterfly wings here we have a strange ghost with butterfly wings and then the chess game is not accurate it has pieces in three colors for some reason so again out of all these examples if I had to pick a winner I would go with Hydream it's really good at handling these tricky prompts all right here's another tricky prompt a
09:30 - 10:00 hand holding a pen writing in a diary on the page there is handwritten text that says "This is quite a long piece of text much longer than what typical AI image models could generate accurately." This is a test to see if all the text will show up correctly here also remember to subscribe to my channel that's your cue to do so by the way so here's Hydream's generation this does not look like handwritten text but it did get most of the text correct however because this is a super long sentence it did
10:00 - 10:30 mess up in the middle here but somehow it was able to get it correct again near the end and then here's Flux Dev's generation and this is the best out of three generations so it could not get all of the text correct in fact this text doesn't even make sense and then here is SD3.5's generation I guess it got the first two words correct and then it started messing up again plus oh my god look at the fingers here it can't even hold a pen that's because SD3.5 is
10:30 - 11:00 pretty nerfed in terms of understanding human anatomy and then for SDXL the hand holding the pen looks really good this does look like handwritten text but the text does not say what I specified for it to say even though Hydream's generation is not perfect in terms of actually generating long snippets of text you can see that it's currently the best open-source model for this all right next I wanted to see if it can generate some lower quality amateur photos so here the prompt is a teenage
11:00 - 11:30 woman holding a handwritten note that says "Verify me 49205 low quality selfie photo poor lighting amateur." So here's Hydream's generation the text is accurate however this looks way too polished look at the shallow depth of field there's too much blur in the background this looks like a professional photo Flux Dev looks a bit better you can see the background is less blurry the text looks correct so this looks more like an amateur
11:30 - 12:00 low-quality selfie photo and then for SD3.5 this looks even better so this is kind of the grainy low-quality photo that I was going for however notice that the slash here is messed up and oh my god look at this hand look at these fingers again SD3.5 is just not good with anatomy and then we have SDXL surprisingly this is the style that I was going for so in terms of style SDXL
12:00 - 12:30 actually looks the best however the text is not correct plus her hand is messed up so it's hard to pick a winner here in terms of the least amount of errors and actually looking like an amateur photo I would have to go with Flux Dev in this case all right next the prompt is a modern UI for a consulting firm website so here is Hydream's generation it got some of the text correct so for example consulting firm up here and then contact and then your business and this does
12:30 - 13:00 look like a very minimalist modern website for a consulting firm here is Flux Dev's generation this is also not bad this does look like a professional website although from the photos this looks more like a psychologist or psychiatrist session more than a consultant let me know in the comments if you think the same thing and then here is SD 3.5 this is the best out of three generations and you can see it's not great and then here's SDXL it's not
13:00 - 13:30 too bad but the text is messed up the people are messed up this is only good for like really rough prototyping so out of all four examples if I were to pick the one that looks like a web page for a consulting firm I don't know I would have to go with Hydream let me know in the comments what you think all right next I wanted to see if it can generate actual existing people so the prompt is Will Smith Iron Man and Queen Elizabeth having dinner together so here is Hydream's example Will Smith doesn't really
13:30 - 14:00 look like Will Smith but Iron Man looks perfect plus so does Queen Elizabeth everything looks super detailed it's hard to find any flaws with this for Flux Dev the crazy thing is even out of three generations I could not get Iron Man in the photo it could only generate Will Smith and Queen Elizabeth for some reason and both of them do not look like the actual person so definitely not as realistic looking as Hydream and then for SD3.5 this is actually not bad they kind of look more like Will Smith and
14:00 - 14:30 Queen Elizabeth but again even for three generations it just could not generate Iron Man and then SDXL actually looks pretty good this is the best render of Will Smith but there seems to be like three clones of Queen Elizabeth plus again Iron Man is missing so I don't know what's going on here but if you look at all four generations again if I were to pick a winner based on the prompt I would have to go with Hydream in the next prompt I wanted to see if it can generate hands and fingers
14:30 - 15:00 realistically so the prompt is two hands making a heart symbol and you can see for both Hydream and Flux Dev they were able to pull it off for SD3.5 and SDXL it's just total nightmare fuel so in this instance I would have to give the point to both Hydream and Flux Dev one thing to note is that Hydream is completely uncensored if you download the local version you can generate pretty much anything right away trust me that was the first thing I tried now obviously on YouTube I can't show you
15:00 - 15:30 any of this so here's the best alternative so the prompt is a beautiful young woman she is smiling with her tongue sticking out she is wearing a white bikini at the beach view from above perfect body so here's Hydream's generation she is smiling with her tongue sticking out and everything else does look pretty good and then here's the generation from Flux Dev she's also smiling with her tongue sticking out and everything else does look realistic and accurate and then here is SD 3.5 she is
15:30 - 16:00 smiling with her tongue sticking out however this top does not look accurate plus oh my god what do we have here and then for SDXL the quality of this is actually surprisingly good but notice she does not have her tongue sticking out so in terms of actually following the prompt I would have to give the points to Hydream and Flux Dev all right next I wanted to see if it can generate anime so the prompt is anime style a girl in the city at night so here is
16:00 - 16:30 Hydream's generation this looks pretty good to me it does have some flux vibes to it in that the background is super blurry and then here is Flux Dev's generation again everything looks perfect but the background is kind of a bit too blurry to be anime and if you compare both photos notice that the face kind of looks the same the background kind of looks the same so it's very similar and then here is SD 3.5 the background does look a bit more anime it's not as blurry but all the text is
16:30 - 17:00 messed up and then surprisingly here is SDXL this is a beautiful generation and if you compare all four generations I would have to give the point to SDXL this looks beautiful and the background looks more anime than the other generations thanks to Humva for sponsoring this video no more spending hours on expensive video shoots Humva is an AI-powered avatar platform that lets you create high-quality spokesperson videos in minutes easily create
17:00 - 17:30 tutorials testimonials and social media content with any character and dialogue you want no need for a camera editing skills or even a real person here's how it works first you can choose from hundreds of presenters to be the face of your content they can range from realistic people to 3D cartoons like this or if you want to use yourself just upload a single photo and create your own digital avatar next you can choose from a huge selection of voices in multiple languages let's make this
17:30 - 18:00 awesome together let's make this awesome together let's make this awesome together let's make this awesome together whatever you're aiming for you're in the right place let's make it happen they also offer support for many different languages so it's really easy to generate content to reach a global audience this is perfect for businesses influencers and marketers looking to create content and boost engagement whether you need a sales pitch product intro an onboarding tutorial or a social
18:00 - 18:30 media post Humva makes it ridiculously easy to create professional-looking videos at a low cost to celebrate their official launch Humva is offering an exclusive first month free trial for new users see full details in the description below all right next I also wanted to test its ability to generate different art styles so here the prompt is Manet style impressionist painting of a deer in a forest and for your reference an impressionist painting by
18:30 - 19:00 Manet looks like this so the point of these paintings is you're supposed to take a few steps back and kind of blur your vision a bit to see the scene like the people here the grass the ships in the background nothing is really defined in shape these are just brush strokes but if you add all of this together and you zoom out and you blur your vision then you do see a nice picture now here are the generations for all four models note that none of them could actually get this impressionist art style correct
19:00 - 19:30 so even for Hydream the deer is way too defined and then for Flux Dev this is even worse this just looks like a cartoon or something and then for SD3.5 this is actually not bad we do see more of this brushstroke effect however again the deer just kind of looks too defined here and then for SDXL this actually looks the best to me it's not perfect but if you compare it back to this reference image then I would pick SDXL as the winner here all right next I
19:30 - 20:00 wanted to test its ability to generate uncommon animals because most image generators out there they can generate common animals like dogs and cats but they completely fail to generate more uncommon species so here the prompt is a pair of spectral tarsiers on a tree and for your information spectral tarsiers look like this they're actually primates and they have massive eyes and these guys live in Sulawesi some people call them the real-life Baby Yoda actually let me
20:00 - 20:30 also copy this reference image and paste it on the side here but you can see that none of them could actually generate a realistic-looking spectral tarsier Hydream came close but this looks more like a lemur than a tarsier Flux Dev and SD3.5 just got it completely wrong and then surprisingly SDXL is actually the best looking one here this still looks horrible but out of these four models I would say SDXL is the winner here all
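The per-prompt verdicts given in the tests above can be tallied with a short script. This is only a sketch: the video never states a scoring rule, so awarding one point to every model named as a winner (including both models in a tie) is an assumption, and the prompt labels are shorthand rather than the exact prompts.

```python
from collections import Counter

MODELS = ["Hydream", "Flux Dev", "SD3.5", "SDXL"]

# Winner(s) the speaker names for each test prompt.
# Assumption: a tie gives one point to each model named.
VERDICTS = {
    "ballerina mid-leap": ["Hydream"],
    "isometric bedroom": ["Hydream"],
    "yearbook page": ["Hydream"],
    "GTA 6 PS5 cover": ["Hydream"],
    "tiger vs ghost chess": ["Hydream"],
    "long diary text": ["Hydream"],
    "amateur selfie": ["Flux Dev"],
    "consulting firm UI": ["Hydream"],
    "celebrity dinner": ["Hydream"],
    "heart hands": ["Hydream", "Flux Dev"],
    "beach portrait": ["Hydream", "Flux Dev"],
    "anime city at night": ["SDXL"],
    "Manet-style deer": ["SDXL"],
    "spectral tarsiers": ["SDXL"],
}


def tally(verdicts):
    """Count one point per named winner; models with no wins stay at zero."""
    scores = Counter({model: 0 for model in MODELS})
    for winners in verdicts.values():
        scores.update(winners)
    return scores


scores = tally(VERDICTS)
for model, points in scores.most_common():
    print(f"{model}: {points}")
```

Under this rule Hydream finishes well ahead, with Flux Dev and SDXL level on points and SD3.5 scoring none.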
20:30 - 21:00 right so that sums up my test of Hydream compared to the other top open-source image generators I hope that gives you a good and comprehensive idea of what it can and cannot do overall after testing all those tricky prompts Hydream did score the most points and you know after a few days of testing this at least for me I can confirm that this is the best open-source model out there now one thing to note is that I'm kind of doing a disservice by generating photos just
21:00 - 21:30 from the base models because all of these are open source people can then fine-tune better checkpoints and LoRAs based on the base model so for example for Flux I would never use the Flux base model I would use a more realistic or more detailed checkpoint same with SDXL I would never use the base model I would use a fine-tuned version like RealVisXL or Juggernaut to generate even more detailed and realistic images so I'm sure in a really short amount of time people are going to start creating
21:30 - 22:00 fine-tuned checkpoints and LoRAs with this Hydream base model which are going to be even better than the generations I showed you today and plus because this is completely uncensored you can bet that there's going to be a lot of checkpoints and LoRAs for corn like literally any position or fetish you can think of people are going to fine-tune something for that anyways that sums up my test next let's go over how and where you can actually use this so here's their official GitHub repo and note that
22:00 - 22:30 they've launched three different models the full model is of course the best quality but this does take longer to generate one image and they recommend that you set at least 50 inference steps to generate an image using this model and then we also have dev which is a bit faster but lower quality and then we have the fast model which is the fastest but the lowest quality but honestly even the fast model is pretty good this model does require an Nvidia GPU if you don't have that you can run this online so
22:30 - 23:00 they have released a free Hugging Face space notice that this only uses the dev model which is slightly lower quality but faster so the Hugging Face space is just over here and it's pretty simple to use so you would just enter your prompt here select the aspect ratio and then click generate image however because this is hosted on Hugging Face this space is censored for free unlimited and uncensored generations you would have to download this locally and that's what we're going to go over right now note
23:00 - 23:30 that you do need to have a CUDA GPU however their official models do require an insane amount of VRAM to run I think it's like over 60 GB or something crazy like that the good news is there are already quantized versions of this one option is this fork which I'll link to in the description below this is a 4-bit quantized model so there's some sacrifice in quality but this allows you to run it with as low as 16 GB of VRAM
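To see why 4-bit quantization cuts memory so sharply, here is a rough back-of-the-envelope sketch of the memory needed for model weights alone at different precisions. The 17 billion parameter count is an assumption that is not stated in the video, and `weight_memory_gb` is a hypothetical helper written for this illustration. Real VRAM use is higher, because text encoders, activations, and framework overhead also take space, so these numbers are a lower bound and will not match the figures quoted in the video.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: parameter count for the image model, not given in the video.
ASSUMED_PARAMS = 17e9

for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    gb = weight_memory_gb(ASSUMED_PARAMS, bits)
    print(f"{label}: ~{gb:.1f} GB for weights only")
```

Each halving of the bits per weight halves the weight memory, which is the trade the quantized fork makes in exchange for some loss in quality.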
23:30 - 24:00 plus it includes a nice interface for you to generate images so you don't have to work with code however this method does lack some additional settings like the CFG scale or the number of steps so another place to use Hydream with more customizability would be through this ComfyUI Hydream Sampler and if you scroll down here it says the quantized version requires roughly 15 GB of VRAM so it's not great I understand some of you might have even lower VRAM but this
24:00 - 24:30 is what we have right now I'm sure in the near future there's going to be even more efficient versions of this which you can run with lower VRAM but anyways in this video I'm going to go over how to install this Hydream Sampler for ComfyUI so I'll link to this page in the description below as well over here it contains all the instructions on how to set this up let's go over this step by step and by the way this assumes that you already have ComfyUI installed on your computer if you don't see this tutorial where I go over how to install
24:30 - 25:00 and use ComfyUI step by step it's actually really easy once you get the hang of it so the first step is to get Flash Attention from Hugging Face and install it over here now I'm using Windows so this tutorial will mainly be for Windows users and it's quite a pain in the ass to install Flash Attention for Windows so make sure you follow this step by step first we need to go to this Hugging Face page which contains all the pre-built Flash Attention wheels for
25:00 - 25:30 Windows if you click on this files tab this contains all the wheels for different builds of Torch and CUDA and Python so how do you know which one to install assuming you have ComfyUI installed here is our ComfyUI folder and you should see this Python embedded folder so let's double click on that and then at the top here let's type in cmd to open this up in command prompt now in order to check what version of Python you have for ComfyUI you just need to type
25:30 - 26:00 in python.exe --version so let's press enter and here it says I'm using Python 3.11 all right the next thing we need to check is what version of CUDA you have installed like I said a moment ago this only works if you have a CUDA GPU so anyways in order to check your CUDA version simply type in nvcc --version and here you should see that I'm using CUDA 12.4 all right the final thing that we need to check is what version of Torch you have installed
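The three checks in this part of the walkthrough (Python, CUDA, and Torch versions) can also be gathered by one short script run with ComfyUI's bundled interpreter. This is a minimal sketch, not part of the original tutorial. The short tags it prints (`cp311`, `cu124`, `torch2.5.1`) assume the naming commonly seen in pre-built wheel filenames, and the helper names are hypothetical. Note that `torch.version.cuda` reports the CUDA version Torch was built against, which is not necessarily the toolkit version that `nvcc --version` prints.

```python
import platform


def python_tag(version: str) -> str:
    """Turn '3.11.9' into the CPython tag 'cp311'."""
    major, minor = version.split(".")[:2]
    return f"cp{major}{minor}"


def cuda_tag(version: str) -> str:
    """Turn '12.4' into 'cu124'."""
    return "cu" + "".join(version.split(".")[:2])


def torch_base_version(version: str) -> str:
    """Turn '2.5.1+cu124' into '2.5.1' by dropping the local build suffix."""
    return version.split("+")[0]


def detect_versions() -> dict:
    """Collect Python, Torch, and CUDA versions; Torch fields stay None if absent."""
    info = {"python": platform.python_version(), "torch": None, "cuda": None}
    try:
        import torch  # only available if Torch is installed in this interpreter

        info["torch"] = torch_base_version(torch.__version__)
        info["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    info = detect_versions()
    print("Python:", info["python"], "->", python_tag(info["python"]))
    if info["torch"] is None:
        print("Torch is not installed for this interpreter")
    else:
        print("Torch:", info["torch"], "->", "torch" + info["torch"])
    if info["cuda"] is not None:
        print("CUDA (Torch build):", info["cuda"], "->", cuda_tag(info["cuda"]))
```

Saved as a file and run with the same python.exe that ComfyUI uses, this prints the values to look for when choosing a wheel.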
26:00 - 26:30 now to see if you even have torch installed for comfy UI simply type in python.exe-m pip list and then press enter and this would basically list out all the packages and dependencies you have existing for comfy UI so if you scroll down here you should see that at least for me I have Torch installed and this is for 2.5.1 and then the CUDA version is 12.4 if you don't see Torch installed then you would need to type in
26:30 - 27:00 python.exe so this uses Python and then pip to install torch torchvision and torchaudio but I already have this installed so I don't need to do this step all right so given all this information going back to this Hugging Face page let's select the appropriate Flash Attention wheel to download so I'm using Torch 2.5.1 and then CUDA 12.4 so let's find that so here's Torch 2.5.1 and then CUDA 12.4 and then I'm using
27:00 - 27:30 Python 3.11 so it's going to be this one so here you can see Python 3.11 and then CUDA 12.4 Torch 2.5.1 so let's download this one i'm going to click on this button to download it and you can download this wherever you want i've just put this in my downloads folder so afterwards I'm going to right-click here and then click on copy as path all right so next going back to our command prompt within this Python embedded folder to
27:30 - 28:00 install Flash Attention we need to type in python.exe -s -m pip install and this basically uses pip to install our downloaded wheel and then I will just paste in the path of the wheel which we just copied so let's press enter and then wait for this to complete perfect so that was pretty quick you can see here it says successfully installed Flash Attention all right so now that we got the hard part done the next step is to install accelerate so again in this
28:00 - 28:30 Python embedded folder we simply need to run this code so let me copy this and then going back to our command prompt make sure you are within this Python embedded folder and then we would paste this code in here i already have accelerate installed so this didn't install anything additional for me all right the next step is to install this node with Comfy Manager so we can exit out of this command prompt window and then in our Comfy UI folder let's start up Comfy UI so I'm going to double click on this run_nvidia_gpu.bat which will
28:30 - 29:00 start up the interface all right so after you see your Comfy UI loaded up let's click on manager and then click on custom nodes manager and let's search for Hydream over here it looks like this new node isn't registered in their database yet but that's okay we can also install it via a git URL so I'm going to click on this and then I'm just going to copy the URL of this GitHub and paste it over here so let's click okay and wait
29:00 - 29:30 for this to install perfect so it says it has been installed please restart Comfy UI so I'm going to close this and then close and then click restart and notice when I restart Comfy UI again it's now installing this Hydream sampler as well as other dependencies so this is going to take a few minutes all right so after downloading and installing all the packages at least for me I got this additional error no module named Triton so it looks like we need to also install
29:30 - 30:00 Triton so in order to do that let me exit out of this first and then again in my Comfy UI folder in Python embedded I'm going to click at the top here and then type in cmd to open up command prompt in Python embedded and then I'm going to paste in this so this is going to use Python and pip to install or upgrade Triton for Windows let's press enter and wait for this to install as well and after this is done then hopefully everything should work all right so here it says successfully installed Triton
30:00 - 30:30 for Windows so let's exit out of this and then one final thing we need to do is that we need to add this in our run_nvidia_gpu.bat file so let me go back to my Comfy UI folder and then in this file run_nvidia_gpu.bat let me right-click this and then click edit in Notepad and then I'm going to copy this and then paste it in here so I'm going to press Ctrl+S to save it and then exit out of this so now let me double-click on this
30:30 - 31:00 to start up Comfy UI again and you can see when I start it up it says use flash attention over here all right so after you load up Comfy UI the next step is to simply drag and drop the workflow of this onto your Comfy UI interface so here is the basic workflow let's click on this and then click download and you can download this wherever you want i'm going to call it Hydream.json and then click save and the nice thing about Comfy UI is this might look really complicated to you with all
31:00 - 31:30 these nodes and noodles but you don't actually need to set up anything yourself you can just drag and drop an existing workflow onto here so for example I'm going to take this Hydream.json which I just downloaded and then simply drop it onto Comfy UI and this is pretty much it here's how you would use Hydream in Comfy UI here's where you would select the model type for my first test run I'm just going to use this fast model i'm going to leave the prompt the same resolution I'm going to leave square and then next we have seed this is basically the starting point of the image so if you set all the
31:30 - 32:00 settings the same and you keep the same seed you're going to generate the exact same image as before if you use a different seed number it's still going to generate a photo of an astronaut riding a horse on the moon but it's going to be slightly different from an image made with another seed usually I just set the seed to randomize and then for the number of steps you kind of need to know how diffusion models work so it starts off with an image of random noise and at each step it takes away some of that noise until eventually
32:00 - 32:30 you get your final image so the number of steps is basically how many steps of denoising you want the model to take the more steps you have the more detailed your image will be but at a certain point you're going to get diminishing returns so if you want to leave it at the default you would just set the value to minus one but if you want to change it to like 20 steps then you would set the value to 20 over here i'm just going to set this back to minus one and then CFG is basically how literally do you want the AI to follow your prompt so if you set this to a really high value like
32:30 - 33:00 10 for example it's going to follow your prompt really literally whereas if you set this to like two then it's going to be more creative and not follow your prompt so much again for this I'm just going to leave it at the default value so I'm going to set this to minus one so let's click run and see if this works note that the first time you run this it is going to have to download the model first so if I open up my command prompt it's now fetching and installing the
33:00 - 33:30 Hydream models all right so that took like 30 minutes to download all the checkpoints but note that this is only the first time you run this workflow afterwards it's going to be really quick and as you can see here even with the fastest and smallest model it was able to generate a really beautiful photo of an astronaut riding a horse on the moon now note that this is just a preview image node so in order to save it you would have to rightclick this and then click on save image if you wanted to just autosave every time then you can
33:30 - 34:00 also get rid of this node and instead let's double click anywhere and then search for save image and select this node and then we just need to connect the noodle to this node so if we use save image instead of preview image then the image would be saved every time in your Comfy UI output folder and that's pretty much how you would set this up pretty easy to get started so that sums up my review and tutorial on Hydream at least from my few days of initial
34:00 - 34:30 testing this does seem to be the best open-source image generator we have right now and yes this even beats Flux let me know in the comments what you think of this if you've had a chance to play around with Hydream what other impressive generations were you able to come up with and as always if you run into any issues or errors during the installation feel free to paste the error message in the comments below and I'll try to help you troubleshoot as much as possible as always I will be on the lookout for the top AI news and tools to
34:30 - 35:00 share with you so if you enjoyed this video remember to like share subscribe and stay tuned for more content also there's just so much happening in the world of AI every week i can't possibly cover everything on my YouTube channel so to really stay up to date with all that's going on in AI be sure to subscribe to my free weekly newsletter the link to that will be in the description below thanks for watching and I'll see you in the next one