SDXL base vs. refiner (Reddit)

[Image captions: SDXL base / SDXL base + refiner.]

Just FYI, 10 steps seems like a lot for the refiner. See "Refinement Stage" in section 2.5 of the SDXL report. The refiner is a separate model specialized for denoising of 0.2 or less on "high-quality, high-resolution" images.

As you can see, the first picture was made with DreamShaper, all the others with SDXL. The difference is subtle, but noticeable.

When using SDXL with, let's say, the DPM++ 3M SDE Exponential sampler at 25-40 steps and a CFG of 5, you will always get better results than with these speed LoRA solutions. With SDXL I often have the most accurate results with ancestral samplers. Try DPM++ 2S a Karras, DPM++ SDE Karras, DPM++ 2M Karras, Euler a, and DPM adaptive. Play around with different samplers and different amounts of base steps (30, 60, 90, maybe even higher).

I do 20+4 steps and get good results. 20 steps shouldn't surprise anyone; for the refiner you should use at most half the number of steps you used to generate the picture, so 10 should be the max. https://github.com/wcde/sd-webui-refiner

However, I've switched over to doing 20 full steps on the base with no leftover noise, and then making the refiner add noise for steps 20-24 (so just 4 added steps, with a relatively small amount of noise added). This gives a decent finished result from the base model while still allowing the refiner to make reasonable adjustments.

Just using SDXL base to run a 10-step ddim KSampler, then converting to an image and running it on 1.5 with a good model while retaining content, and then taking it back to SDXL for refinement.

Overall, all I can see is downsides to their openclip model being included at all. Understandable, it was just my assumption from discussions that the main positive prompt was for common language such as "beautiful woman walking down the street in the rain, a large city in the background, photographed by PhotographerName" and the POS_L and POS_R would be for detailing such as "hyperdetailed, sharp focus, 8K, UHD", that sort of thing.

AFAIK, the VAE is mostly trained on high-quality images without watermarks or text in them. My opinion is that it's actually pretty incredible, considering it's a 48:1 compression and can handle quite a lot of normally distributed noise added to the latent stage before you start to see issues in the decoded image.

It's a branch from A1111, has had SDXL (and proper refiner) support for close to a month now, is compatible with all the A1111 extensions, but is just an overall better experience, and it's fast with SDXL on a 3060 Ti with 12GB of RAM, using both the SDXL 1.0 and refiner workflow with a diffusers config set up for memory saving.

Some observations: the SDXL model produces higher-quality images. The base model is perfectly capable of generating an image on its own.

Are you sure you didn't get those refiner and non-refiner pix switched around? Cause the ones without the refiner look a lot more like the ones with the refiner should look.

To make full use of SDXL, you'll need to load in both models, run the base model starting from an empty latent image, and then run the refiner on the base model's output to improve detail.
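To make that base-to-refiner handoff concrete, here is a minimal sketch using the Hugging Face diffusers library (A1111 and ComfyUI expose the same idea through their UIs). The 80/20 split, step count, and prompt are arbitrary illustrative choices, not anything prescribed in the comments above.

```python
# Sketch: base model starts from an empty latent, stops at 80% of the schedule,
# and hands its partially denoised latent to the refiner, which finishes the job.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,  # share the second text encoder and VAE to save VRAM
    vae=base.vae,
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"  # arbitrary example prompt
n_steps = 20
handoff = 0.8  # base handles the first 80% of denoising, refiner the last 20%

# Base: generate a partially denoised latent (no decode to pixels yet).
latents = base(
    prompt=prompt,
    num_inference_steps=n_steps,
    denoising_end=handoff,
    output_type="latent",
).images

# Refiner: pick up exactly where the base stopped and finish the remaining steps.
image = refiner(
    prompt=prompt,
    num_inference_steps=n_steps,
    denoising_start=handoff,
    image=latents,
).images[0]

image.save("sdxl_base_plus_refiner.png")
```

Lowering `denoising_end`/`denoising_start` hands more of the schedule to the refiner; 0.8 simply mirrors the "base does 16 of 20 steps" split discussed in these comments.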
I was using a W10 system with an RTX 3060 with 12GB VRAM and 16GB of RAM. During renders in the official ComfyUI workflow for SDXL 0.9 base+refiner, my system would freeze, and render times would extend up to 5 minutes for a single render.

I don't know of anyone bothering to do that yet. But if SDXL wants an 11-fingered hand, the refiner gives up.

If you're using the A1111 WebUI and you're at version 1.5 or greater (1.6.x is the current version), then you should be all set.

Here are the base and the base + refiner models.

The base model did 16 of 20 steps, or 80% of the job. The refiner did the remaining 4 steps, so 20% of the job. We still want the refiner to denoise the remaining 20%, but use 20 steps instead of 4. Thus, it needs to denoise 20/4 = 5 times less per step. Set up the refiner sampler, but multiply the base sampler's step counts by 5.

Refiners should have at most half the steps that the generation has.

I think what people aren't exactly realizing is that it takes time, resources, and a bit of knowledge to fine-tune a checkpoint. Really, it's not easy.

On some of the SDXL-based models on Civitai, they work fine without using the refiner (though you might need to have the base SDXL refiner downloaded and in place even if it's not being used; I'm not 100% sure on that one).

I've had some success using SDXL base as my initial image generator and then going entirely 1.5 models for refining and upscaling.

[Gallery captions: comparison of using ddim as the base sampler with different schedulers; 25 steps on base model (left) and refiner (right); base model; I believe the left one has more detail, so back to testing; comparison grid; comparison between 24/30 (left, using refiner) and 30 steps on base only; Refiner on SDXL 0.9 (right) compared to base only, working as intended.] Especially on faces. E.g. using your prompt: L is without refiner, R is with.

Yes, I agree with your theory. Second picture is base SDXL, then SDXL + Refiner 5 steps, then 10 steps and 20 steps. Using the base v1.5 model does not do justice to the v1 models.

Yes, the base and refiner are totally different models, so a LoRA would need to be created specifically for the refiner. So yeah, just like highres fix makes everything in 1.5 better, it'll do the same to SDXL.

I'm back with another stupidly huge comparison, this time featuring three base models VS one little Juggernaut boi. Most users use fine-tuned v1.5 models to generate realistic people.

Unlike SD1.5 and SD2.1, base SDXL is so well tuned already for coherency that most other fine-tune models are basically only adding a "style" to it. That also explains why SDXL Niji SE is so different: it is tuned for anime-like images, which TBH is kind of bland for base SDXL because it was tuned mostly for non-anime content.

Since I don't know shit about coding, as soon as I got the Python file working for the SD3 API I copy-pasted it 70 times and manually edited each one.

You don't actually need to use the refiner. Use the SDXL 1.0 base and have lots of fun with it. Base focuses on overall composition; the refiner tweaks it and adds detail, complexity.

The issue with the refiner is simply Stability's openclip model.

I'd love to run a 1.5 model as the refiner, but I'm running into VAE issues. I've gone in and set the Preferred VAE for the models I'm using, and tried messing around with the checkboxes and such in the VAE section of the Settings, but I'm still getting VAE glitching when it runs the refiner.

As with SDXL-Lightning, Hyper-SDXL has some trade-offs versus using the base model as is.

I trained a LoRA model of myself using the SDXL 1.0 base. The title says it all, but I'll share my experience. Here's what I've found: when I pair the SDXL base with my LoRA on ComfyUI, things seem to click and work pretty well. But as I ventured further and tried adding the SDXL refiner into the mix, things took a turn for the worse.

Unlike most workflows that partially diffuse with the base model and then hand off to the refiner, I use a workflow that fully diffuses the base model, then adds a bit of noise and diffuses with the refiner. We wanted to make sure it still could run for a patient 8GB VRAM GPU user.
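For contrast with the partial handoff, here is a sketch of the "fully diffuse the base, then add a bit of noise and diffuse with the refiner" approach described above, again assuming diffusers. The `strength=0.2` value is my stand-in for the small amount of added noise (with 20 steps it works out to roughly 4 refiner steps), and `enable_model_cpu_offload` is one way to squeeze both models onto a smaller GPU; neither value comes from the comments themselves.

```python
# Sketch: finish the image entirely with the base model, then lightly img2img it
# with the refiner to add fine detail.
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
)
base.enable_model_cpu_offload()  # memory saving for low-VRAM GPUs

refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
)
refiner.enable_model_cpu_offload()

prompt = "portrait photo of a woman, natural light"  # arbitrary example prompt

# 1) Fully denoise with the base model (no leftover noise).
base_image = base(prompt=prompt, num_inference_steps=20).images[0]

# 2) Re-noise slightly and let the refiner clean up fine detail.
#    With strength=0.2 and 20 steps, the refiner only runs ~4 effective steps.
refined = refiner(
    prompt=prompt,
    image=base_image,
    num_inference_steps=20,
    strength=0.2,
).images[0]

refined.save("sdxl_base_then_refiner_img2img.png")
```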
The refiner refines the image, making an existing image better. The base model seems to be tuned to start from nothing and then get to an image. The refiner, if used correctly, adds fine detail, after all. The refiner model improves rendering details. Easy enough.

The base model was trained on the full range of denoising strengths, while the refiner was specialized on "high-quality, high resolution data" and denoising of <0.2.

If you use a LoRA with the base model, you might want to skip the refiner, because it will probably just degrade the result if it doesn't understand the concept. The refiner is entirely optional and could be used equally well to refine images from sources other than the SDXL base model.

The reason we broke up the base and refiner models is that not everyone can afford a nice GPU to make 2048 or 4096 images.

I'm interested in trying SDXL for the base image for the better CLIP prompting, and then using ControlNet SoftEdge and/or Depth to regenerate the image in SD 1.5.

When 1.5 came out, yeah, it was worse than SDXL for the base vs. base models.

Base SDXL mixes OpenAI CLIP and OpenCLIP, while the refiner is OpenCLIP only. While the base checkpoint has two CLIP models, CLIP G and CLIP L, the refiner only has CLIP G. The main prompt is used for the positive prompt of the CLIP G model in the base checkpoint, and also for the positive prompt in the refiner checkpoint. The secondary prompt is used for the positive prompt of the CLIP L model in the base checkpoint.
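If you want to send different text to the two encoders programmatically, the diffusers SDXL pipeline exposes `prompt` and `prompt_2`. Note that UIs label the "main" and "secondary" boxes differently; in diffusers, `prompt` feeds the CLIP L encoder and `prompt_2` the OpenCLIP (CLIP G) encoder, so the natural-language vs. detail-tag split below is only an illustration of the idea, not a rule, and the prompts are made up for the example.

```python
# Sketch: feeding different text to the two SDXL text encoders with diffusers.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16, variant="fp16", use_safetensors=True,
).to("cuda")

image = pipe(
    prompt="hyperdetailed, sharp focus, 8K, UHD",          # goes to the CLIP L encoder (detail tags)
    prompt_2="a woman walking down a rainy city street",   # goes to the CLIP G / OpenCLIP encoder (plain language)
    negative_prompt="blurry, lowres",
    negative_prompt_2="blurry, lowres",
    num_inference_steps=30,
).images[0]

image.save("sdxl_dual_prompt.png")
```

If `prompt_2` is omitted, the same text is used for both encoders, and the refiner pipeline only takes a single prompt, which is consistent with the refiner checkpoint having only CLIP G as noted above.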