Pixel Art Gif Upscaler
submitted by /u/marcoc2
Flux Realism LoRA comparisons!!
So I made a new Flux LoRA for realism (Real Flux Beauty 4.0) and was curious how it would compare against other realism LoRAs. I had way too much fun doing this comparison, lol. Each generation uses the same seed, prompts, etc., except for the LoRA strength, where I used each LoRA's recommended value. All the LoRAs are available on both Civitai and Tensor.Art.
I. SEE. YOU.
submitted by /u/PantInTheCountry
Moving a heavy box
submitted by /u/Unit2209
A (personal experience) guide to training SDXL LoRas with One Trainer
Hi all,
Over the past year I created a lot of (character) LoRAs with OneTrainer, so this guide covers training realistic LoRAs of humans, a concept probably already known to all SD base models. It's a quick tutorial on how I go about creating very good results. I don't have a programming background, and I don't know the ins and outs of why a certain setting works. But through a lot of testing I found out what works and what doesn't, at least for me. :)
I also won't go over every single UI feature of OneTrainer; most of it is self-explanatory. Also check out YouTube, where you can find a few videos about the basic setup and layout.
1. Prepare Your Dataset (This Is Critical!)
Curate High-Quality Images: Aim for about 50 images, ensuring a mix of close-ups, upper-body shots, and full-body photos. Only use high-quality images; discard blurry or poorly detailed ones. If an image is slightly blurry, try enhancing it with tools like SUPIR before including it in your dataset. The minimum resolution should be 1024x1024.
Avoid images with strange poses and too much clutter. Think of it this way: it's easier to describe an image to someone where "a man is standing and has his arm to the side". It gets more complicated if you describe a picture of "a man, standing on one leg, knees bent, one leg sticking out behind, head turned to the right, making two peace signs with one hand...". I found that too many "crazy" images quickly bias the data and decrease the flexibility of your LoRA.
Aspect Ratio Buckets: To avoid losing data during training, edit images so they conform to just 2–3 aspect ratios (e.g., 4:3 and 16:9). Ensure the number of images in each bucket is divisible by your batch size (e.g., 2, 4, etc.). If you have an uneven number of images, either modify an image from another bucket to match the desired ratio or remove the weakest image.
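Before training, it's worth sanity-checking the buckets programmatically. A minimal sketch (the dataset path, target ratios, and batch size below are placeholders; it assumes Pillow is installed):

```python
# Sanity-check aspect-ratio buckets: each bucket's image count should be
# divisible by the batch size. Paths and ratios below are just examples.
from collections import Counter
from pathlib import Path

from PIL import Image

DATASET = Path("my_dataset")                      # hypothetical dataset folder
BATCH_SIZE = 2
TARGET_RATIOS = {"4:3": 4 / 3, "16:9": 16 / 9}    # the 2-3 ratios you settled on

buckets = Counter()
for path in sorted(DATASET.iterdir()):
    if path.suffix.lower() not in {".jpg", ".jpeg", ".png", ".webp"}:
        continue
    with Image.open(path) as img:
        ratio = img.width / img.height
    # Assign the image to the closest target ratio
    name = min(TARGET_RATIOS, key=lambda k: abs(TARGET_RATIOS[k] - ratio))
    buckets[name] += 1

for name, count in sorted(buckets.items()):
    status = "OK" if count % BATCH_SIZE == 0 else "NOT divisible by batch size"
    print(f"{name}: {count} images -> {status}")
```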
2. Caption the Dataset
Use JoyCaption for Automation: Generate natural-language captions for your images but manually edit each text file for clarity. Keep descriptions simple and factual, removing ambiguous or atmospheric details. For example, replace: "A man standing in a serene setting with a blurred background." with: "A man standing with a blurred background."
Be mindful of the words you use when describing the image, because they will also impact other aspects of the image when prompting. For example, "hair up" can also have an effect on the person's legs, because the word "up" is used in many ways to describe something.
Unique Tokens: Avoid using real-world names that the base model might associate with existing people or concepts. Instead, use unique tokens like "Photo of a df4gf man." This helps prevent the model from bleeding unrelated features into your LoRA. Experiment to find what works best for your use case.
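Editing dozens of caption files by hand gets tedious, so a small script can handle the mechanical part. A sketch, assuming the captions sit next to the images as .txt files; the token and filler-word list are just examples:

```python
# Batch-clean caption .txt files: swap the generic subject for a unique token
# and strip vague atmospheric wording. Token and word list are examples only.
import re
from pathlib import Path

CAPTION_DIR = Path("my_dataset")          # hypothetical: captions next to images
TOKEN = "df4gf man"                       # unique token, as described above
FILLER = ["in a serene setting", "serene", "atmospheric"]

for txt in sorted(CAPTION_DIR.glob("*.txt")):
    caption = txt.read_text(encoding="utf-8")
    # Swap the generic subject for the unique token (first mention only)
    caption = re.sub(r"\b[Aa] man\b", f"a {TOKEN}", caption, count=1)
    # Drop vague atmospheric phrasing
    for word in FILLER:
        caption = re.sub(rf"\s*\b{re.escape(word)}\b", "", caption)
    txt.write_text(caption, encoding="utf-8")
    print(f"{txt.name}: {caption.strip()}")
```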
3. Configure OneTrainer
Once your dataset is ready, open OneTrainer and follow these steps:
Load the Template: Select the SDXL LoRA template from the dropdown menu.
Choose the Checkpoint: Train on the base SDXL model for maximum flexibility when combining the LoRA with other checkpoints; this approach has worked well in my experience. Other photorealistic checkpoints can be used as well, but results vary from checkpoint to checkpoint.
4. Add Your Training Concept
Input Training Data: Add your folder containing the images and caption files as your "concept."
Set Repeats: Leave repeats at 1. We'll adjust training steps later by setting epochs instead.
Disable Augmentations: Turn off all image augmentation options in the second tab of your concept.
5. Adjust Training Parameters
Scheduler and Optimizer: Use the "Prodigy" optimizer with the "Cosine" scheduler for automatic learning-rate adjustment. Refer to the OneTrainer wiki for the specific Prodigy settings.
Epochs: Train for about 100 epochs, adjusted to the size of your dataset; I usually aim for 1,500-2,600 total steps.
Batch Size: Set the batch size to 2. This trains two images per step and ensures the steps per epoch align with your bucket sizes. For example, if you have 20 images, training with a batch size of 2 results in 10 steps per epoch.
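To see how these numbers combine into the step target from the previous point (using the ~50-image dataset size suggested earlier):

```python
# How dataset size, batch size, and epochs combine into total steps.
images, batch_size, epochs = 50, 2, 100   # ~50 images, as recommended above
steps_per_epoch = images // batch_size    # 50 / 2 = 25 steps per epoch
total_steps = steps_per_epoch * epochs    # 25 * 100 = 2500, inside 1500-2600
print(steps_per_epoch, total_steps)
```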
6. Set the UNet Configuration
Train UNet Only: Disable all settings under "Text Encoder 1" and "Text Encoder 2." Focus exclusively on the UNet.
Learning Rate: Set the UNet training rate to 1. With Prodigy this acts as a multiplier on the step size the optimizer estimates itself, so it stays at 1 (see the sketch after this list).
EMA: Turn off EMA (Exponential Moving Average).
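OneTrainer wires all of this up internally, but for intuition, here is roughly how Prodigy is used standalone (a sketch assuming the prodigyopt package; the tiny model is a stand-in for the LoRA weights, not OneTrainer's actual training loop):

```python
# Why "learning rate = 1" works: Prodigy estimates the actual step size and
# the nominal lr only scales that estimate, so it is left at 1.0.
import torch
from prodigyopt import Prodigy  # pip install prodigyopt

model = torch.nn.Linear(16, 16)            # stand-in for the UNet LoRA weights
optimizer = Prodigy(model.parameters(), lr=1.0, weight_decay=0.01)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)

for step in range(1000):
    loss = model(torch.randn(4, 16)).pow(2).mean()   # dummy loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()                       # the "Cosine" part of the setup
```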
7. Additional Settings
Sampling: Generate samples every 10 epochs to monitor progress.
Checkpoints: Save checkpoints every 10 epochs instead of relying on backups.
LoRA Settings: Set both "Rank" and "Alpha" to 32.
Optionally, toggle on Decompose Weights (DoRA) to enhance smaller details. More testing is probably needed, but so far I've definitely seen improved results.
Sample Prompts: I specifically use prompts that describe details that don't appear in my training data, for example a different background, different clothing, etc.
8. Start Training
Begin the training process and monitor the sample images. If they don't start resembling your subject after about 20 epochs, revisit your dataset or settings for potential issues. If the samples come out grey, weird, and distorted right from the start, something is definitely off.
Final Tips:
Dataset Curation Matters: Invest time upfront to ensure your dataset is clean and well-prepared. This saves troubleshooting later.
Stay Consistent: Maintain an even number of images across buckets to maximize training efficiency. If this isn’t possible, consider balancing uneven numbers by editing or discarding images strategically.
Overfitting: I noticed that it isn't always obvious that a LoRA has overfitted during training. The most obvious indication is distorted faces, but in other cases the faces look good while the model is unable to adhere to prompts that require poses outside the information in your training pictures. Don't hesitate to try checkpoints from earlier epochs to see if the flexibility is as desired.
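One way to run that flexibility check outside OneTrainer is to loop over the saved epoch checkpoints with diffusers, using a fixed seed and a pose that is not in the dataset. A sketch; the checkpoint paths, epoch numbers, and prompt are placeholders:

```python
# Compare saved epoch checkpoints: same seed, a pose absent from the dataset.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "photo of a df4gf man doing a handstand on a beach"  # pose not in data
for epoch in (60, 80, 100):
    pipe.load_lora_weights(f"checkpoints/lora_epoch_{epoch}.safetensors")
    image = pipe(
        prompt, generator=torch.Generator("cuda").manual_seed(42)
    ).images[0]
    image.save(f"flex_test_epoch_{epoch}.png")
    pipe.unload_lora_weights()   # reset before loading the next checkpoint
```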
Happy training!
Kijai has updated the CogVideoXWrapper: Support for 1.5! Refactored with simplified pipelines, and extra optimizations. (but breaks old workflows)
https://github.com/kijai/ComfyUI-CogVideoXWrapper Notes for this update: This is a big one, and unfortunately the necessary cleanup and refactoring will break every old workflow as it is. I apologize for the inconvenience; if I don't do this now, I'll keep making it worse until maintaining becomes too much of a chore, so from my pov there was no choice. Please either use the new workflows or fix the nodes in your old ones before posting issue reports! The old version will be kept in a legacy branch, but not maintained.
For Fun-model-based workflows it's a more drastic change; for others, migrating generally means re-setting many of the nodes.
Concept art prompt suggestions
Hey, guys. Interested in everyone's suggestions. I'm looking for good prompts to generate various concept art of props: different weapons, interior items, etc. I'm primarily interested in fantasy and sci-fi styles. For fantasy, I'm mainly after stylization similar to Hearthstone, WoW, and LoL. For sci-fi, something smoother, similar to the look of Overwatch, Destiny, and Starfield. And if you have good suggestions for any concept-art prompts, LoRAs, or checkpoints, I'd love to hear them. I'll attach examples of what I'm ideally looking for. Also, if you know of posts where similar questions have been asked before, I'd be glad if you'd send links.
Extending a Video with a Text Prompt
Hi everyone!
I have a short video I recorded on my phone, and I’d like to extend it using a text-based prompt.
For example, in the video, my friend is walking down a path, and I’d love to extend it by adding a scene where he trips and falls because of a stone.
Are there any existing tools or software that can help with this? I’m open to suggestions for AI-based tools or creative editing apps that make this process easy.
Thanks in advance!
Forge UI into Open AI
Hey guys, I'm trying to connect Forge UI to Open AI but I don't know how and can't find anything online. In Open AI I can see only three options to choose from: 1. automatic1111, 2. comfyui, 3. openai. Is there maybe a way to still get Forge UI in there somehow?
Made my first LoRA for Flux: asking for expert feedback!
Hello everyone, I wanted to share with you the LoRA I trained, called 'Flux-American-Dollar-Bills'. In the post where the LoRA is linked, you will find a detailed description of the work done and the steps that led me to the final result. Since this is my first project, I would really appreciate receiving feedback from those with more experience in LoRA creation, so I can get properly oriented before starting a new project. I would like to understand what I can optimize or do differently, possibly following a more effective workflow and strategy. Any advice or suggestions are welcome!
LoRA for Flux Dev
I'm using Flux Dev and I need a LoRA to generate super realistic bold pictures, so which LoRA is best?
Fine tune a model or train a large LoRA online
Soon I will have a big tagged dataset (~2000 images) that I want to use to train an SDXL model or LoRA.
Which is better for a project of this size, a LoRA or a fine-tune? I want to teach it roughly 20 concepts.
Civitai is amazing for LoRA training, but they have a limit of 1,000 images and 10,000 steps. My understanding is that the bigger your dataset, the more training steps you need. Is that true? I plan on something around 30k steps.
Can anyone recommend an online platform that is easy to use for such a task? I've heard good things about RunPod, but maybe better alternatives exist?
Analyze a LoRA merge
Is there a script that can analyze a LoRA merge (in safetensors format, of unknown origin) and generate a detailed report with:
- The trigger words or concepts it was trained on.
- A count of occurrences, sorted in ascending or descending order.
I’m looking for a tool that provides a clear and well-organized report.
Would you be able to suggest what type of script would be useful? What data (and in what format) should the script look for inside the safetensors file to generate this type of report?
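One possible starting point, assuming the file follows the standard safetensors layout (an 8-byte length prefix followed by a JSON header): LoRAs trained with kohya-ss scripts often carry an ss_tag_frequency metadata entry with per-tag counts, though a file of unknown origin may have no metadata at all. A sketch:

```python
# Dump kohya-style tag frequencies from a .safetensors LoRA, if present.
# Usage: python lora_report.py my_lora.safetensors
import json
import struct
import sys

with open(sys.argv[1], "rb") as f:
    header_len = struct.unpack("<Q", f.read(8))[0]   # little-endian uint64
    header = json.loads(f.read(header_len))          # header is plain JSON

meta = header.get("__metadata__", {})
tag_freq = meta.get("ss_tag_frequency")              # kohya-specific; may be absent
if tag_freq:
    for dataset, tags in json.loads(tag_freq).items():
        print(f"dataset: {dataset}")
        for tag, count in sorted(tags.items(), key=lambda kv: kv[1], reverse=True):
            print(f"{count:6d}  {tag}")
else:
    print("No ss_tag_frequency found; metadata keys:", list(meta))
```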
Thanks in advance for any suggestions!
combined workflow of SD 3.5 Large + Flux Dev with Multi GPU support
I created a combined workflow which consists of SD 3.5 Large + Flux Dev.
It takes a single prompt and generates an image from both SD 3.5 Large and Flux Dev.
It needs these custom nodes:
https://github.com/neuratech-ai/ComfyUI-MultiGPU
https://github.com/pythongosssss/ComfyUI-Custom-Scripts
The top pipeline is for SD and the bottom one is for Flux.
The leftmost node contains the main prompt.
You can load the models, CLIPs, and VAE on different GPUs by changing the CUDA device number. (Setting all of them to 0 will sometimes unload a model if memory is low, but it works if you have more than 24 GB of VRAM.)
https://gist.github.com/RageshAntonyHM/d109447a40bcb54a1c6e9553078fbbcc
HELP:
If you know how to properly arrange the nodes, kindly do that and share it in the comments.