ComfyUI: So Easy, Even Your Cat Can Do It (But Please Don't Let Them). Part 1 - The Basics.
Estimated read time: 1:20
Summary
This video tutorial simplifies the complexities of ComfyUI, an interface for the Stable Diffusion model known for its steep learning curve. It guides you through the basics of installing ComfyUI on your local machine, setting up nodes, and using the interface for workflows such as text-to-image, image-to-image, and more. With a casual and engaging approach, the tutorial ensures you understand the ComfyUI interface, the workflow process (including downloading and using models), and how to manipulate nodes to create custom outputs.
Highlights
ComfyUI helps simplify the intricate processes of Stable Diffusion models 🎨.
The installation guide eases the setup process for ComfyUI on local machines 💾.
Nodes are fundamental to ComfyUI, functioning as information units to be processed 🧩.
The video explains workflows in a friendly yet detailed manner, making complex ideas more approachable 🤓.
Through understanding and manipulating nodes, you can create optimized workflows to suit different needs ⚙️.
Practical demonstrations of sampler settings and their effects on image generation are covered, providing insight into optimizing outputs 📊.
Key Takeaways
ComfyUI is flexible and supports various input types like text-to-image and image-to-image 🖼️.
Installing ComfyUI is easier than it seems, with step-by-step guidance provided for different operating systems 💻.
Understanding nodes in ComfyUI is crucial as they represent different functions within the workflow 🧠.
The tutorial provides helpful tips on managing and organizing nodes to create efficient and effective workflows ✔️.
Learning the role of CLIP helps in better instructing models during the image generation process 🔍.
Experimentation with different sampler methods and settings is recommended to get desired results 🎨.
Overview
The tutorial opens with an introduction to ComfyUI, highlighting its prominence as one of the most flexible interfaces for the Stable Diffusion model. Despite its intimidating learning curve, the video assures viewers that it simplifies the learning process by thoroughly explaining each component of the interface. This foundational understanding sets the stage for mastering ComfyUI.
Viewers are guided step-by-step through the installation process of ComfyUI. The video details how to install the interface on various operating systems, like Windows or Linux, stressing that despite seeming complex, the instructions are straightforward and user-friendly. The guide ensures that anyone, regardless of their tech proficiency, can set up ComfyUI and start experimenting with it.
The latter part of the video delves into more technical aspects, such as understanding and manipulating nodes within ComfyUI. The tutorial explains how nodes function within this system, comparing them to functions that take information inputs, process them, and output the results. With this knowledge, viewers can create customized workflows, experiment with different settings, and ultimately utilize ComfyUI to its fullest potential.
Chapters
00:00 - 00:30: Introduction and ComfyUI Overview The chapter provides an overview of ComfyUI, highlighting it as a highly flexible interface for the Stable Diffusion model. Despite its steep learning curve, the video series aims to simplify understanding of ComfyUI by explaining the concepts of nodes and the functionality of different parts of the interface. By the end, viewers are expected to master ComfyUI and create workflows to meet various needs.
00:30 - 02:30: Installing ComfyUI The chapter provides an introduction to installing ComfyUI, emphasizing that it can handle various media transformations like text-to-image and text-to-video. It reassures the reader that the installation is simpler than it seems. The first step involves visiting ComfyUI's GitHub page and scrolling down to find the installation section.
02:30 - 06:00: Run ComfyUI and Install Manager Node The chapter provides a guide on how to run ComfyUI and install the Manager Node, with a focus on Windows operating systems. It acknowledges the presence of step-by-step instructions available for Linux and Apple systems as well, emphasizing that these instructions are user-friendly and not overly technical. The chapter particularly delves into the installation process on Windows, serving as a helpful resource for users with various operating systems.
06:00 - 10:00: Download Model and Generate Images The chapter titled 'Download Model and Generate Images' begins by addressing potential issues users might face while installing ComfyUI on their local machine. The speaker offers assistance with any questions through comments. The chapter provides a step-by-step guide to downloading the necessary file. Specifically, it instructs users to click a direct link, which will initiate the download of a file named 'ComfyUI Windows portable Nvidia', found in the downloads folder after completion. This file is a 7-Zip archive.
10:00 - 16:00: Understanding Nodes and Workflow This chapter explains the basics of handling files using compression tools like WinRAR, WinZip, or 7-Zip. The process is described as straightforward: right-click on the file, select 'extract files', and choose a destination for the extracted files. An example is provided where files are extracted into a new folder named 'ComfyUI' on the D drive. It is mentioned that the links to these tools are provided in the description for those unfamiliar with them.
16:00 - 25:00: Sampler Configuration and Settings The chapter, titled 'Sampler Configuration and Settings,' discusses the process of installing and using WinRAR software for file extraction. It mentions both the paid and free versions, recommending the latter as sufficient. The chapter guides the reader through the installation process and how to extract files by navigating to the downloads folder and selecting the extract option. It notes that the extraction process takes time due to the large number of files.
25:00 - 30:50: Experiment with Sampler and Outputs The chapter titled 'Experiment with Sampler and Outputs' provides instructions on accessing and running the software on Windows. The narrator directs users to locate a 'ComfyUI Windows portable' folder and to run files with a '.bat' extension. It explains that the program works best with an Nvidia GPU but can also run on a CPU if necessary. Users unsure of their hardware capabilities are advised to attempt running it on an Nvidia GPU first and to switch to CPU if they experience any issues. Close attention is needed to the windows that open during the process to ensure the correct setup.
30:50 - 42:00: Workflow Customization and Saving The chapter discusses the difference in performance between CPU and GPU versions when generating images or performing other actions. It notes that using a CPU will result in slower processing times compared to using a GPU. The reader is instructed to double-click on the 'Run Nvidia GPU' option, which opens a command line terminal window. Once this process is completed, a browser window will open, displaying a page with a specific local address.
42:00 - 42:00: Conclusion The chapter titled 'Conclusion' discusses the final steps involved in setting up a system with a focus on enhancing user interaction through a dedicated interface. It highlights the utilization of a local machine to access a user interface, known as ComfyUI, and the forthcoming exploration of this interface. An essential addition, termed 'Manager', is introduced, which automates several features previously requiring manual effort, thereby streamlining processes and enhancing user experience through automation.
ComfyUI: So Easy, Even Your Cat Can Do It (But Please Don't Let Them). Part 1 - The Basics. Transcription
00:00 - 00:30 hi everyone, in this video series we're going to take a look at ComfyUI. ComfyUI without a doubt is one of the most flexible interfaces for the Stable Diffusion model, and although it's got a very steep learning curve, we're going to simplify everything. In this video series you're going to understand the whole nodes concept, what each part of the interface does, and you're going to become a master of ComfyUI and be able to generate workflows for almost every requirement you have, starting with
00:30 - 01:00 simple text-to-image, image-to-image, text-to-video, image-to-video, and everything in between. So let's get started without any delay. The first thing that we need to do is actually to install ComfyUI on our local machine. It sounds complicated, but it's much easier than you think. Just head over to the ComfyUI page; it's a GitHub page, and if you're not familiar with GitHub it's okay, just go to their page and scroll down until you see Installing ComfyUI. Clicking on it will take you
01:00 - 01:30 directly to the links where they have specific instructions for each operating system. We're going to take a look at the Windows one because I have a Windows operating system, but if you have a Linux or an Apple operating system you can also just go step by step based on the instructions they made; these are very friendly instructions, they're not that technical. So let's just take a look at the Windows install. Of course, if
01:30 - 02:00 you're having any issues installing ComfyUI on your local machine, you can just leave a comment and I will do my best to answer any questions that you might have. So we're just going to go to the direct link to download; clicking it will simply download the file. Let's let it complete the download and see what happens next. Once the download is complete you will find in your downloads folder a file called ComfyUI Windows portable Nvidia; this is a 7-Zip file
02:00 - 02:30 and the process is very simple: just right click and click extract files, then you can simply select where you want to extract the files to. I'm going to create a separate folder on my D drive, I'm going to call it ComfyUI, and just click okay. If you are not familiar with the concept of WinRAR or WinZip or 7-Zip, it's rather simple; I added the link in the description
02:30 - 03:00 you can just download this WinRAR software. There is a paid version, but you also have the free version, which is more than enough. Then after the installation you can just go to your downloads folder, right click on the file, and here you will have the extract files option. The extraction process will take some time because there are thousands of files that it needs to open up. Once we're done with the extraction of the files, all you have to do is go to the folder where you extracted the
03:00 - 03:30 files. There you will see a folder called ComfyUI Windows portable; double click it, go inside, and you will see two files with a .bat at the end. ComfyUI is basically best used with an Nvidia GPU; if you don't have one you can run it on a CPU. If you're not sure whether you have the right GPU or not, just try the run Nvidia GPU one; if you see that it fails or something like that, simply close the window it opens and run
03:30 - 04:00 the CPU version. Please do note that if you don't have a GPU, the time that it will take to generate images or do any actions will be much slower than using a GPU. So you just have to simply double click on the run Nvidia GPU file and it will open up this black window, this terminal window; it's called a command line. Once it's done it will pop up the browser and open a page with this address; this is a local address, and basically it opens ComfyUI on your local machine.
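As a quick aside to this step: if you want to confirm from code that the local server came up, a tiny check like the one below works. It assumes ComfyUI's usual default address of 127.0.0.1:8188; the exact address is whatever the terminal window prints when it starts.

```python
# Minimal reachability check for the local ComfyUI server.
# 127.0.0.1:8188 is the usual default; use the address shown in your terminal.
import urllib.request

with urllib.request.urlopen("http://127.0.0.1:8188/") as resp:
    print("ComfyUI responded with HTTP status", resp.status)
```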
04:00 - 04:30 We will go over the interface a bit later, but before that there's one more step that I want you to do, and it is to install something called Manager. What the Manager node does is add features to the interface itself, so some things that require some manual work are now automated directly through the interface. It is much easier
04:30 - 05:00 to manage your models, your LoRAs, your custom nodes that we will discuss in later videos, but it is very important and very vital. So let's see how we are going to do that. We'll just go to the link of ComfyUI Manager, and right there you will have instructions on how to install it. This is how you install it: you go to the folder where you extracted the files, where you had the run Nvidia GPU .bat file; over
05:00 - 05:30 there you will see a folder called ComfyUI. Click it, go inside it, and here you will see a folder called custom nodes. Double click it; again you will see that it's an empty folder. Click on the address bar of your file explorer and type CMD; clicking CMD will open this black window, and there all you have to do is go to the website and copy this line called git
05:30 - 06:00 clone. Copy it as is, go back to the command line, simply paste it and click enter; wait for a few seconds, it's very fast. Once you're done, just go to your ComfyUI: you can close the tab of the browser, close the terminal where ComfyUI started, and then go back to the folder where the ComfyUI installation is and double click run Nvidia GPU once again.
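For reference, the manual Manager install (open a command window inside custom nodes and run git clone) can also be scripted. This is only a sketch: the extract path below is hypothetical, and the repository URL is assumed to match the one shown on the ComfyUI Manager page, so adjust both to your own setup.

```python
# Sketch of the manual ComfyUI-Manager install: `git clone` inside custom_nodes.
# Both the path and the repo URL are assumptions; adjust to your machine and
# to the clone line shown on the Manager's GitHub page.
import subprocess
from pathlib import Path

custom_nodes = Path(r"D:\ComfyUI\ComfyUI_windows_portable\ComfyUI\custom_nodes")
subprocess.run(
    ["git", "clone", "https://github.com/ltdrdata/ComfyUI-Manager.git"],
    cwd=custom_nodes,
    check=True,
)
```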
06:00 - 06:30 What it does is actually reload everything and install the Manager. It will take a few seconds, and after installing all the required files it will pop up the browser again, and this time we will see that we have a button called Manager on the menu. Clicking Manager will show you a lot of features that we're not going to discuss right now, but we will go over everything
06:30 - 07:00 here. The one thing that we do need to do is to download our model. The way to download our model is simply through the Manager: go to install models, and here what we want to look for is SDXL. Type it in the search, click search, and you will see that you have a lot of models; we want to download the SDXL base, you can see it right here, and just click on install.
07:00 - 07:30 Once you click install you can see it tells you here that it is installing the model into your folder. It will take some time because it actually downloads the model from the Hugging Face website; if you're not familiar with Hugging Face, it's okay, you don't have to know it, because the ComfyUI Manager actually takes care of everything. While it's downloading, if you want to understand a bit more of what's happening behind the scenes, if you open up the terminal where ComfyUI started you can see here that it actually downloads the
07:30 - 08:00 model file. You can see a progress bar here and it tells you that it's downloading; this is the download URL. So the download is almost complete, as you can see we are at the final percentage, and once the download is done you will see that the information becomes red and it tells you that in order to apply the installed model you need to refresh the main menu. So all you have to do is go here to
08:00 - 08:30 the button called refresh, and once you click it you will see that here on the checkpoint you will see the SDXL base safetensors. Once we're done with these steps we can actually start generating images. If you are wondering where the model that we just downloaded is, you can simply go to the folder of ComfyUI where we extracted the files and go to the ComfyUI Windows portable folder; there, under the ComfyUI folder, you will see a folder called
08:30 - 09:00 models. We will go over all these things in more detail, but this folder holds all the relevant models for the ComfyUI interface, and checkpoints is where we downloaded the models to; this is where we put the main models, the models themselves, and the SDXL base. You can see it's a 6 GB file, which is pretty big; I suggest you have at least 40 or 50 gigabytes free to download models, because there are a lot of models and a lot of other things that we need to download in order to make the best out of the ComfyUI software.
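If you'd rather confirm from code where the download landed, a small path check like this one does it. The root folder and the exact checkpoint file name are assumptions here; use your own extract location and whatever file name appears in your checkpoints folder.

```python
# Hypothetical sanity check for the downloaded checkpoint's location.
from pathlib import Path

root = Path(r"D:\ComfyUI\ComfyUI_windows_portable\ComfyUI")
ckpt = root / "models" / "checkpoints" / "sd_xl_base_1.0.safetensors"
print(ckpt, "->", "found" if ckpt.exists() else "not found")
```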
09:00 - 09:30 So here you can see the file called SDXL base, and going back to the interface, if we want to just check that everything works fine we can click this Queue Prompt button. You can see that the queue
09:30 - 10:00 size is now one and some process has started. When it comes to ComfyUI, the workflow keeps highlighting the phase that it's currently processing, and you can see that it started from the load checkpoint and went over to the prompts. We will go over all of these nodes later on, but now I just want you to see the workflow in action and how simple it was to install it and get it started. And you
10:00 - 10:30 can see we indeed have a beautiful scenery nature glass bottle landscape with a purple galaxy bottle. If we click Queue Prompt again it will generate another image, this time a different one, because we're using a random seed. So let's just make sure that everything works fine. Great, you can see it generated another bottle. And now, just for the more technical concept, if I
10:30 - 11:00 take a look at the console where I opened ComfyUI: I do recommend, if you have multiple screens, to always leave this window open on one screen because it shows you what happens behind the scenes. If I click Queue Prompt and go back to the screen, you can see that it shows me the progress bar; it actually matches the progress bar that you can see here, but it shows me the advancement of the generation, and you can see it also once again generated images. So now let's
11:00 - 11:30 clear the entire interface. You can just click here the button called Clear, and it asks you, are you sure you want to clear the workflow? Let's click okay and step by step build this default workflow, and throughout the process I will explain every node in detail, what it does and why you need to use it. So what is a node? Basically, a node is a
11:30 - 12:00 function that gets information, what we call an input, and then, using all kinds of settings or variables or parameters, changes this data and outputs it. So if I take a look at an example node, never mind what this node does: it has its inputs, this is the information that it can receive; it has its parameters,
12:00 - 12:30 which decide how it's going to process the information that it just got; and it's got the output. This output in turn can go to a new node. Once again, disregard what this node does; all you have to know is that this node got information, the information was processed, and now the information is output into another node as its
12:30 - 13:00 input, and here on this input you might not have parameters, but it does some kind of function on the data that it received and it can output to another node. There are some nodes that don't have an input and there are some nodes that don't have an output; this node, for example, its output is the actual data that it's processing, it generates images for example.
13:00 - 13:30 This node doesn't have an input; the only input it has is this parameter, which loads a file. So now that we know what a node is, how can I use a node? Well, you can actually add a node in one of several ways. The simplest way is simply double clicking the mouse and then you will have a popup with a search. I can then say, okay, I want to load something; once I type, it will dynamically show me the list of nodes that match the search term. We want to load a checkpoint.
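To make the node idea concrete, here is a toy sketch in Python, not ComfyUI's real API: each node is just a function with inputs, parameters, and outputs, and one node's output is wired into the next node's input. All the names and return values are made up for illustration.

```python
# Toy illustration of nodes as functions; none of this is real ComfyUI code.

def load_checkpoint(ckpt_name: str):
    # No inputs, only a parameter (the file name); three outputs,
    # mirroring the Load Checkpoint node described in the video.
    model = f"model-from:{ckpt_name}"
    clip = f"clip-from:{ckpt_name}"
    vae = f"vae-from:{ckpt_name}"
    return model, clip, vae

def clip_text_encode(clip, text: str):
    # One input (clip), one parameter (text), one output (conditioning).
    return {"clip": clip, "prompt": text}

# Chaining nodes: one node's output becomes the next node's input.
model, clip, vae = load_checkpoint("sd_xl_base_1.0.safetensors")
conditioning = clip_text_encode(clip, "a dog running on a road")
print(conditioning)
```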
13:30 - 14:00 A checkpoint is actually the model file that I can select from here. If you remember, in the beginning of the video we downloaded a checkpoint called SDXL, which is located inside the ComfyUI models/checkpoints folder; this is the file, and here the parameter is the actual file name. What this node does is actually load the file and output the model
14:00 - 14:30 that this file holds. A checkpoint is basically the name we give a model that is being trained: in the middle of training you create checkpoints of the training process, and you can use them later on and load them to use them in another process. So when we load a checkpoint we basically load a checkpoint of a model that was trained, where its weights and functions and algorithms are correctly
14:30 - 15:00 set up so that it fits our needs. Stable Diffusion XL is one of those models, and we loaded the Stable Diffusion checkpoint in order to work with it and generate images with it. So now one of the outputs of this node is the actual model, which is the actual weights and functions of the model that I want to use; another one is a CLIP, and the third one is a VAE. Throughout this video
15:00 - 15:30 we will have some technical terms, but I'm not going to dive deep into the technical concepts; rather, I'll explain in a more generic sense so you'll have an understanding of what each thing on this workflow does. If you want to drill down a bit more into the details, you can simply look it up on Google and you will have more than enough information about how all the technical parts work. But the next thing on our workflow is
15:30 - 16:00 CLIP. What is a CLIP? In order to understand what CLIP does for a model, you can imagine you're playing a game with a friend where the friend is blindfolded and you need to instruct him what to draw on a board; the more detailed you are, the better he will be able to draw things on the board. CLIP works in kind of a similar way: you enter a text, and this CLIP instructs the model what to paint and where to paint it. So if I'm
16:00 - 16:30 going to write something like a dog walking on grass, the CLIP will instruct the model to generate an image that relates to a dog and an image that relates to grass. But how does the model know what grass is and what a dog is? This actually happens in the training process. When we train a model to generate pictures, we feed it thousands of images and we give a description for each image. From those images that have a
16:30 - 17:00 description, the computer generates something called a latent. You can imagine the latent as a word cloud, millions of words that have a connection between them; this generates kind of a multi-dimensional connection between each word and each image. So when you're using a CLIP, you're basically telling the model: look at the latent space and generate things that relate to the words that I gave you. When we're talking about a coherent model, it means
17:00 - 17:30 that it's got a better understanding of the prompts that I've written versus the images that it generated. Now let's see what nodes can use the CLIP information from the model. So now that I know what I want to generate, for example let's say that I want to generate a dog running on a road, I can simply go to the CLIP area, and you can see another way to create nodes in ComfyUI is to hover on the output or the input of the node that I want to connect and click
17:30 - 18:00 and drag, and it will generate a link directly. Once I let go it will suggest to me what nodes I might want; if my node is not showing up here I can simply click search and search for the node. But in our case we want to create a CLIP Text Encode. What CLIP Text Encode does is actually take a prompt, so we said we want to generate a dog running on a road, and it knows to take it
18:00 - 18:30 and convert it to a language the sampler later on (we'll see what a sampler is) knows how to speak, and it goes on and generates something called conditioning. Conditioning is basically the process of taking these texts and converting them to conditions; based on these conditions, the sampler that we talk about later on knows how to use the model to generate images. In Stable Diffusion we can use two text encodes, so I can, from
18:30 - 19:00 that CLIP, drag once again and generate another text encode. The reason I'm doing two is because we have a positive and a negative: a positive text encoding, or a positive conditioning, is basically the things that I want the model to include in the image; the negative is the things that I want the model to exclude from the image. It is called conditioning
19:00 - 19:30 because it gives the model the conditions to generate the image, and it gives the model the conditions not to generate the image; if you want, you can look at it like that. Another word that you can sometimes hear instead of text encoding or conditioning is text embedding. So eventually what the CLIP does is generate a text embedding, which is a coded language that the model knows how to speak in order to generate the images.
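Continuing the toy sketch from earlier (still not real ComfyUI code), the positive and negative prompts are simply two separate encodings made with the same CLIP, one describing what to include and one what to exclude; the negative text below is an invented example.

```python
# Toy illustration: two encodings from the same CLIP become the positive and
# negative conditioning that are handed to the sampler.
def clip_text_encode(clip, text: str):
    return {"clip": clip, "prompt": text}

clip = "clip-from:sd_xl_base_1.0.safetensors"
positive = clip_text_encode(clip, "a dog running on a road")  # include this
negative = clip_text_encode(clip, "blurry, low quality")      # exclude this
print(positive, negative)
```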
19:30 - 20:00 In order to make the workflow a bit nicer, we have several tools at our disposal. So how can I manipulate and use all those nodes? First of all, clicking the node title I can move it wherever I want, and you can see there are those connections; probably, if this is the first time you're using ComfyUI, your connections won't be like mine with straight lines but rather more curved lines, and you can actually change it here: if you go to the settings on the
20:00 - 20:30 right side of the menu and click it, on the bottom section you can see that there is a link render mode. I like keeping them straight, it's nicer to the eye, it's more tidy in my point of view. If I click spline and generate it, you can see it actually converts them to curved lines; when there are many, many lines it's called spaghetti, because there are so many lines that it actually looks like spaghetti. But it's up to your own
20:30 - 21:00 preferences. Once you're done with the workflow you can completely hide the lines, which is also kind of a nice way to keep things tidied up; if you click hide you won't see them at all, but in this case you don't really see the connections between the nodes. And linear is direct straight lines, which is also kind of messy; I like the straight ones best, but
21:00 - 21:30 you know, everyone has their own preferences; this one helps me keep things more tidy. Besides moving a node, I can also click here on the circle and it folds the node, so I can fold the nodes. But in order to understand what each node is in my workflow, I can basically rename the title: I can simply right click and here on the title
21:30 - 22:00 just type whatever I want. So I can say that this is the positive prompt, and I can right click here on the title and say that this is the negative prompt; this way, on my workflow, I can clearly understand what each node does. Not only that, I can even give it a color: if I right click and go to colors I can make it green, and I can make this one red. It's good practice
22:00 - 22:30 to color the positive prompts green and the negative prompts red, because it really shows immediately and you don't really have to think about what the text means. So this is usually the minimum that I will do on workflows, and not always; it depends if I'm building the workflow for a one-time process or if it's something that I'm going to use more and more.
22:30 - 23:00 Another cool thing about ComfyUI is that it allows you to take these nodes, let's say this is something I want to use again and again: I can simply select all those nodes using shift, I can just click one, hold shift, and click the others, and then I can just right click and convert to a group node. I can give this group a name, prompts for example, and basically it will take all those nodes together
23:00 - 23:30 into one big node; you can see it now contains the positive and the negative prompt text and also the checkpoint loader, and now it has all the outputs of all the nodes combined. I will click Ctrl+Z to undo the changes; I don't really like the combined view when I'm still working on the node, because there are a lot of times that I want to move it to another location and see how
23:30 - 24:00 it impacts things. So this is it for now. If you have a lot of nodes, I can also choose this one and this one, just right click and align to left; we're not really seeing what it did, but if I do align to left it will actually put both nodes on the same line, it will align them to the left, and so on. As we go on you will see more and more things that I'm doing. On the bottom right side you can see
24:00 - 24:30 that you can resize your nodes to whatever size you need, and this way you can actually keep your workflow very tidy and very nice. So now I have my positive prompt and my negative prompt, and you can see that we have something called conditioning. So what is the conditioning? As we said, it's what tells the model what to include in the image and what not to include in the image, and it does it in some kind of
24:30 - 25:00 encoding, where it takes the text and encodes it into a language that the sampler can understand. So once we drag the conditioning you will see that it already recommends me to use a KSampler, and here you can see that the sampler has several inputs: the positive, which is the positive conditioning or the positive prompt, and it's also got a negative. So we need to
25:00 - 25:30 take the negative and drag it from here to here. Another thing that it accepts is the model, so for the model we just drag it from here to here, and you can see it already pulls the line right in here, which kind of bugs me because I'm not really seeing where it connects. One thing that you can do, you can actually do something like that, and you can see that you have here a reroute; if you just click reroute it will actually create a
25:30 - 26:00 break, and then I can go from here to here and connect it like so. So now it gives me some kind of control over where I want to connect from, and now I can clearly see that the model connects to the sampler. You don't have to do it, I just like things that are a bit tidier. So now we have the sampler that accepts a model, a positive prompt, a negative prompt, and it needs a latent
26:00 - 26:30 image. What is the latent image? You can just click and drag either from the output or from the input and let go, and you will see that it will give me this empty latent image. An empty latent image actually tells the model to start from scratch, start from an empty image without any information on it. What does it mean? It means that in future workflows we will see that I can take
26:30 - 27:00 not an empty latent image but start from an already generated image and use the model to change and alter this image. In this case we're taking an empty latent image, and it receives a width and a height; this is basically the size of the output image that I want to generate. SDXL works best on 768 by 768, and it is really important to keep the size of width and height in powers
27:00 - 27:30 of two, so don't go and generate something like 753 of width; like we said, we're going to do 768, or you can do 1024, times a half and it will give you 512, and so on, so you can do the basic mathematics here, it's very simple and very easy. And the batch size is the number of images I want to generate in one go; I usually keep it at one
27:30 - 28:00 because in ComfyUI we can actually generate four images one after another, and that way it doesn't load the GPU too much. So now we have an empty canvas, or an empty latent image, and we have the positive prompt and we have the negative prompt, everything is connected to the sampler, and now we need the sampler to take all this information, process it using these parameters that we're going to go over in a minute, and
28:00 - 28:30 output a latent. A latent is basically, in this case, what the sampler managed to extract from our latent space; if you want, it managed to extract everything and generate an image step by step, bit by bit, until it completed the entire process and gave us the final image that it generated. Once we drag our cursor from the latent it will give us VAE
28:30 - 29:00 Decode. What VAE Decode actually does is take the latent information from the sampler and transform it into an actual image. You can see here that we have something called VAE; the VAE is what takes the information from the sampler and knows how to turn it back from a latent, which is made of codes. If you remember what we said before, the text becomes a code with
29:00 - 29:30 the encoder and goes into the sampler, which can now use the empty latent area, take the text encodings or the conditioning, and, using the model and its weights, convert it into a representation of an image, which is called the latent. What the VAE decoder does is take a VAE; in this case we don't have a specific VAE, and later on in future videos we'll see how
29:30 - 30:00 we can load custom VAEs, but for now we're going to use the default VAE of the model. So we're just going to do the same here, create a reroute, and we're going to connect it here; so let's do another reroute here and then connect it here, so we can actually connect it here. And what it does is it knows to take
30:00 - 30:30 the information from the sampler and, using the formula or the algorithm that is defined in the VAE file, convert it into an image. So the output of the VAE decoder is an image, and here we're going to output a preview image. So let's go over the workflow quickly: we are loading a checkpoint, which is the actual model, inside a file called SDXL
30:30 - 31:00 base safetensors. This extracts the model, the CLIP, and the VAE. The model connects directly to the sampler; the CLIP goes into what we call a text encoder, in which case we have a positive prompt and a negative prompt that generate conditioning. These conditionings tell the sampler how to convert the texts into an image and imprint it on an empty
31:00 - 31:30 latent image, meaning an empty canvas that the sampler can take and construct the information on. Then all this information that the sampler generates is sent into a VAE decoder that, using the VAE file, knows how to decode the code that the sampler generated into an actual image, and this image is what we are going to preview.
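To tie the recap together, here is the same graph written out by hand as a Python dict, loosely modeled on ComfyUI's API-style workflow format. The node class names match ComfyUI's built-in nodes, but the node IDs, the checkpoint file name, the prompt text, and the settings are only illustrative, not a file exported from the interface.

```python
# Hand-written map of the default graph rebuilt in the video (illustrative).
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "sd_xl_base_1.0.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",   # positive prompt
          "inputs": {"clip": ["1", 1], "text": "a dog running on a road"}},
    "3": {"class_type": "CLIPTextEncode",   # negative prompt
          "inputs": {"clip": ["1", 1], "text": "blurry, low quality"}},
    "4": {"class_type": "EmptyLatentImage",
          "inputs": {"width": 768, "height": 768, "batch_size": 1}},
    "5": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "positive": ["2", 0],
                     "negative": ["3", 0], "latent_image": ["4", 0],
                     "seed": 1, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
    "6": {"class_type": "VAEDecode",
          "inputs": {"samples": ["5", 0], "vae": ["1", 2]}},
    "7": {"class_type": "PreviewImage", "inputs": {"images": ["6", 0]}},
}
print(len(workflow), "nodes: checkpoint -> prompts -> sampler -> decode -> preview")
```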
31:30 - 32:00 So now we have a dog running on a road; let's just click Queue Prompt and see what happens. You can see that the workflow starts with loading the checkpoint, you can see that it turns green, meaning that it's working. It completed loading the checkpoint, so it went to the positive and negative and the empty latent image, and now it's working with the sampler; you can see the progress bar here, and we
32:00 - 32:30 told the sampler to use 20 steps to generate our latent image, or our latent coding; later on we'll understand what it actually means. And it generates it using a seed, which is a random number generated by the node itself. Once the sampler is complete it runs the code into the VAE decoder, and the VAE decoder, using the VAE file, generates an actual
32:30 - 33:00 image. So here you can see the output, and if I click anywhere on the canvas that is not an actual node I can move the space around, and using the scroll button of the mouse anywhere that is not a node will actually zoom in and zoom out. So you can see it generated a very nice image of a
33:00 - 33:30 dog. Now let's focus a bit on the sampler and understand what each of the parameters means. So the seed and the control after generate are what actually decide the randomization of the generated images. If I change here the control after generate to fixed, and for example here I put the number one and I click Queue Prompt, you will
33:30 - 34:00 see that the image is being generated, and you can see that the image was generated. If I click Queue Prompt again nothing happens, and that's because ComfyUI, in a smart way, keeps the output of each node, and if nothing changes it doesn't rerun it. So now I can actually decide the number of steps that I want to use to generate the image; if I do one step it will be a very weird
34:00 - 34:30 image that is very noisy, and the more steps I guide the sampler to process, the clearer and clearer the image will become. Now if I go to 10, you will see that the image becomes much more coherent and much more finalized. The CFG decides how closely I want the sampler to stick to the prompt, so if I go to a very low number it will actually change the image
34:30 - 35:00 and make it less coherent with the prompt that I've written. A good value for CFG for general models would be somewhere around 4.5 to 7, so at 4.5, if we run the image, you will see that we get a pretty clear and nice image, and with 10 steps we can already see where the image is going. If we want to generate more and more details we'll simply introduce more
35:00 - 35:30 steps to the sampler. So let's say now that instead of 10 steps, I like the concept and I can say 20; the 20 steps will now generate a much clearer and much more detailed image. It actually changed the image completely, and if you go to the view history here on the right you can actually load the last generations with all the settings that were used; so you can see that it started with a German Shepherd and it became something like a
35:30 - 36:00 Doberman now. The more steps I introduce to the sampler, the more it will refine the results, until at some point it will have such a minor effect that it would be useless to use so many steps; you can see that now it actually started distorting and ruining the image. So this is the trial and error: based on the
36:00 - 36:30 prompt that you wrote and the image that you're striving to achieve, you need to play with the number of steps. I usually start with 12 to 14 steps to see if the general composition is something that I like, and if it's something that I like I continue from there. So now let's do randomize, which means that every time I click Queue Prompt it will generate a random seed and the image will be different. I accidentally clicked okay here, but it doesn't matter because it already
36:30 - 37:00 started running, and if we click Queue Prompt again you will see that it generates a new seed every time it completes the flow; so it runs on this number and generates a new number. If you want to remember the last seed that you generated, you can just go to the history and click load. So this is not an image I want; I can click Queue Prompt again, and this is a nicer image that I
37:00 - 37:30 want to work with. In order to understand the sampler name and the scheduler, let's first understand how the sampler actually works. What a sampler does, as its name suggests, is actually create samples: it takes the conditioning, it takes the model, and it takes the empty latent image that we created, or an existing one, it doesn't really matter, and it generates noise in the first
37:30 - 38:00 place, and then on each step (the number of steps decides how many steps it will do the process in), it tries to sample the noise into an actual image. So every time I run another step it takes the result of the previous step and clears its noise. The setting that is called denoise actually tells the sampler how
38:00 - 38:30 much noise we want to clean up, so denoise of one actually results in an image with no noise. Later on we will see how we use it in order to generate images that still have some noise in them, and then we can take them and continue the workflow from that point, but this is for another video; in this video we're going to leave the denoise at one. And we're going to understand that the sampler name instructs the sampler in
38:30 - 39:00 what algorithm or what method to use in order to decipher, or convert, the noise into an actual image, and the scheduler decides at what frequency it makes the changes; so basically you can play with both variables to get different results. Euler is one of the simpler algorithms that is used by samplers; if you want something that is a bit more detailed you can choose
39:00 - 39:30 DPM++, or DPM++ 2M; this is the one that I use quite a lot. And as for the scheduler, the Karras scheduler usually gives much better results when used with the DPM. If we leave everything the same, the prompt is the same, the latent image is the same, the seed is the same, the steps are the same, and we run the prompt right now, you will see that there are some differences, or even major differences; it depends on the
39:30 - 40:00 process of the sampler, and we will see that it changes the results. As you can see, the colors are much more saturated in this case and the image looks a bit twisted; usually it means that we need a few more steps, so let's do it with 18 steps. You can see now that the image is a bit clearer, but the eyes of the dog are a bit scary. But this is basically it: the combination
40:00 - 40:30 of the DPM and the Karras will usually give you a more detailed and crisper image than Euler, which is more generic; Euler is a bit faster, so you can use it as a tool to understand the overall composition of the image that you want, and then you should start playing with the steps. Let's try and see what happens if we increase the steps to 24 and run it again, and you can see that the image changed a little bit.
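As a compact reference for the trial and error described here, these are the kinds of starting values the narrator suggests, written as a plain dict. The key names follow ComfyUI's KSampler fields as I understand them, and the numbers are rough starting points rather than fixed rules.

```python
# Rough KSampler starting points, per the video's suggestions.
ksampler_starting_points = {
    "steps": 12,                 # start around 12-14 to judge composition
    "cfg": 4.5,                  # roughly 4.5 to 7 for general models
    "sampler_name": "dpmpp_2m",  # "DPM++ 2M", the narrator's frequent pick
    "scheduler": "karras",       # pairs well with DPM++ per the video
    "denoise": 1.0,              # left at 1.0 throughout this video
}
print(ksampler_starting_points)
```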
40:30 - 41:00 We can load the previous one and this one, and you can see that there are slight changes, but overall the general composition of the image remains the same. So depending on the result you're looking for, you just need to play with the steps. I usually start with a lower number of steps to see if the overall composition is what I want to achieve, and from there I try to alternate the numbers, going to a fixed seed and then just changing things. For example, if now I
41:00 - 41:30 want to take this one and change the dog to be a German Shepherd dog, and I click the queue, as you can see the composition pretty much remained the same but the dog itself changed; it didn't quite manage to render its body, but I just wanted to show you this idea. Another small thing that you can do: let's say that you've generated a workflow that you really like, you can click save, it will pop up a prompt for a
41:30 - 42:00 name, and this will be the demo one. Then the next time you want to load a workflow, just click load and you can go and load the JSON file, and it will load the images and the settings and the parameters that you've just created. Thank you for watching the video, I really hope you found it useful and you liked it; if so, please hit like, hit subscribe, and see you in the next video.