Fluffy Kontext
RetroVHS Mavica-5000 - Flux.dev LoRA
I lied a little: it's not pure VHS. The Sony ProMavica MVC-5000 is a still-video camera that saves single video frames to floppy disks. Yep, it's another VHS-flavored LoRA, but this one isn't washed-out like the 2000s Analog Core LoRAs. Think ProMavica after a spa day: cleaner grain, moodier contrast, and even the occasional surprisingly pretty bokeh. The result lands somewhere between late-'80s broadcast footage and a '90s TV drama freeze-frame: VHS flavour, minus the total mud-bath.

Why bother?

- More cinematic shadows and color depth.
- Still keeps that sweet lo-fi noise, chroma wiggle, and subtle smear, so nothing ever feels too modern.
- Low-dynamic-range pastel palette: cyan shadows, magenta mids, bloom-happy highlights.

You can find the LoRA here: https://civitai.com/models/1738734/retrovhs-mavica-5000

P.S.: I plan to adapt at least some of my LoRAs to Flux Kontext in the near future.
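For anyone who wants to try a LoRA like this outside of a UI, here is a minimal diffusers sketch for loading a downloaded Flux LoRA. The filename and prompt are placeholders (not from the post), and any trigger words listed on the Civitai page would still need to be added:

```python
# Minimal sketch: loading a downloaded Flux LoRA with diffusers.
# The .safetensors filename below is a placeholder; use the file you
# actually download from the Civitai page linked above.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("retrovhs_mavica5000.safetensors")  # placeholder filename

# Add the LoRA's trigger word(s) to the prompt if the model page lists any.
image = pipe(
    "portrait of a woman in a dim kitchen, late-80s still-video look",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("retrovhs_test.png")
```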
Homemade SD1.5 major update 1❗️
I've made some major improvements to my custom mobile homemade SD1.5 model. All the pictures I uploaded were created purely by the model, without using any LoRAs or additional tools. All the training and all the pictures I uploaded were done on my phone. I have a Mac mini M4 16 GB on the way, so I'm excited to push the model even further. I'm also almost done fixing the famous hand/finger issue that SD1.5 is known for. I'm striving to match Midjourney, or get as close as I can, in terms of capability.
OmniAvatar released the model weights for Wan 1.3B!
OmniAvatar released the model weights for Wan 1.3B! For those who don't know, OmniAvatar is an improved model based on FantasyTalking. GitHub: https://github.com/Omni-Avatar/OmniAvatar

We still need a ComfyUI implementation for this; as of now, there is no native way to run audio-driven avatar video generation in Comfy. Maybe the great u/Kijai can add this to his WAN-Wrapper?

The video is not mine; it's from user nitinmukesh, who posted it here along with more info: https://github.com/Omni-Avatar/OmniAvatar/issues/19

P.S. He ran it with 8 GB of VRAM.
Chattable Wan & FLUX knowledge bases
I used NotebookLM to make chattable knowledge bases for FLUX and Wan video. The information comes from the Banodoco Discord FLUX and Wan channels, which I scraped and added as sources. It works incredibly well at taking unstructured chat data and turning it into organized, cited information!

Links: 🔗 FLUX Chattable KB (last updated July 1)

You can ask questions like:
or for Wan:
Thanks to the Banodoco community for the vibrant, in-depth discussion. 🙏🏻 It would be cool to add Reddit conversations to knowledge bases like this in the future. Tools and info if you'd like to make your own:
Local image processing for garment image enhancement
Looking for a locally run image-processing solution to tidy up photos of garments like the attached images. Any and all suggestions welcome, thank you.
Universal Method for Training Kontext LoRAs without having to find or edit pairs of images
So, the problem with training Flux Kontext is that it needs pairs of images. For example, if you want to train an oil-painting style, you would need a photo of a place plus a corresponding painting. It can be slow and laborious to find or edit such pairs. BUT it doesn't have to be that way:

1) Gather images in the style you want, for example Pixar/Disney style.
2) Use Flux Kontext to convert these images to a style the base Kontext model already knows, for example plain cartoon. You then train a LoRA on the pairs: Pixar images + their cartoon conversions. (A rough batch-conversion sketch for this step is below the list.)
3) After the LoRA is trained, pick any image, say a photo of New York City, and use Flux Kontext to convert that photo to cartoon.
4) Lastly, apply the LoRA to the cartoon photo of New York City to get the target style.

This is a hypothetical method.
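As a rough sketch, step 2 could be batched with diffusers' FluxKontextPipeline. The model ID, prompt, and guidance value here are assumptions based on the public FLUX.1 Kontext [dev] examples, not something specified in the post:

```python
# Sketch of step 2: batch-convert a folder of style images (e.g. Pixar-like)
# into a style the base Kontext model already knows (e.g. flat cartoon),
# producing the paired data for LoRA training.
# Assumes a diffusers version that ships FluxKontextPipeline.
from pathlib import Path

import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
).to("cuda")

src_dir = Path("style_images")       # images in the target style (training target side)
dst_dir = Path("converted_cartoon")  # known-style conversions (training source side)
dst_dir.mkdir(exist_ok=True)

for img_path in sorted(src_dir.glob("*.png")):
    image = load_image(str(img_path))
    result = pipe(
        image=image,
        prompt="Convert this image into a simple flat cartoon illustration",
        guidance_scale=2.5,
    ).images[0]
    result.save(dst_dir / img_path.name)  # same filename, so the pairs line up
```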
Anyone else struggling with hands in image generation? Using FLUX.1 on Perchance...
Hey everyone,
So I’ve been messing around with this open-source image generator based on FLUX.1-schnell—I'm using it through Perchance over here: https://perchance.org/ai-text-to-image-generator — and man, I know rendering hands has always been a weak point in AI art, but this model really seems to have a vendetta against fingers.
Like, I get it. Every model has its quirks. But with this one, even when the rest of the image looks solid—composition, lighting, facial structure, clothing, all looking tight—the hands end up looking like a cursed pretzel made of flesh. Sometimes it's like they're fused together, or there are way too many fingers, or worse... it just blends into some kind of melted glove.
I’ve tried changing the prompts, getting really specific—stuff like “realistic hands,” “five fingers,” “clear hand anatomy,” or even things like “holding a glass with a natural hand pose.” I've also tried negative prompting with “blurry hands,” “distorted fingers,” “mutated limbs,” etc. Still... no luck. Same nightmare fuel.
So, I’m curious:
- Has anyone found any prompt tricks that consistently help with hands in FLUX.1 or similar models?
- Is there some sort of setting or seed manipulation I should be looking into?
- Or is this just the nature of the beast with this specific model?
I know Stable Diffusion and others have a bunch of tools and add-ons like ControlNet or inpainting that help, but since I’m using this one directly through Perchance, I don’t think there’s a lot of backend flexibility. Still, if anyone knows a way to get halfway-decent hands—or at least stop them from merging into creepy finger-blobs—I’m all ears.
Thanks in advance. Just trying to keep my generated people from looking like they’ve been sculpted by a drunk octopus.
A huge thanks to the nunchaku team.
I just wanted to say thank you. Nunchaku looks like magic, for real. I went from 9.5 s/it on my 8 GB 4070 laptop GPU to 1.5 s/it.
I tried plugging in my 3090 eGPU, and it sits at 1 s/it, so a full-sized 3090 is only marginally faster than a laptop GPU with a third of the VRAM.
I really hope all future models will implement this, it really looks like black magic.
EDIT: it was s/it, not it/s
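For context, this is roughly what the nunchaku setup looks like in plain diffusers. The transformer class and model IDs are recalled from the nunchaku README and may have changed between releases, so treat it as a sketch and check the repo for current equivalents:

```python
# Sketch: running Flux with nunchaku's SVDQuant 4-bit transformer in diffusers.
# Class and model names are taken from the nunchaku README and may differ
# in newer releases.
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel

# Load the pre-quantized INT4 Flux transformer released by the nunchaku team.
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
    "mit-han-lab/svdq-int4-flux.1-dev"
)

# Drop it into a standard Flux pipeline in place of the bf16 transformer.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    "a cat holding a sign that says hello world",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("nunchaku_test.png")
```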
MediaSyncer - Easily play multiple videos/images at once in sync! Great for comparing generations. Free and Open Source!
https://whatdreamscost.github.io/MediaSyncer/

I made this media player last night (or mainly AI did) since I couldn't find a program that could easily play multiple videos in sync at once. I just wanted something I could use to quickly compare generations. It can't handle many large 4K video files (it's a very basic program), but it's good enough for what I needed it for. If anyone wants to use it, there it is, or you can get a local version here: https://github.com/WhatDreamsCost/MediaSyncer
Need help catching up. What’s happened since SD3?
Hey, all. I’ve been out of the loop since the initial release of SD3 and all the drama. I was new and using 1.5 up to that point, but I moved out of the country and fell out of using SD. I’m trying to pick back up, but it’s been over a year, so I don’t even know where to begin. Can y’all provide some key developments I can look into and point me in the direction of the latest meta?
SimpleTuner v2.0.1 with 2x Flux training speedup on Hopper + Blackwell support now by default
https://github.com/bghira/SimpleTuner/releases/tag/v2.0.1
Also, you can now use Hugging Face Datasets more directly: it has its own defined data backend type and a caching layer, and it is fully integrated into the dataloader config pipeline, so you can cache to S3 buckets or a local partition as usual.
Some small speed-ups for S3 dataset loading w/ millions of samples.
Wan 14B training speedups to come soon.
Wan 2.1 pixelated eyes
Hi guys,
I have an RTX 3070 Ti, so I'm only working with 8 GB of VRAM for Wan 2.1 + Self Forcing.
I generate with:
- 81 frames
- 640 x 640
- CFG 1
- Steps 4
The eyes always lose quality post-render. Is there any way for me to fix this? Or is it really just a matter of more VRAM, so I can run at 1280 x 1280 or above to keep eye quality?
Thanks
Live Face Swap and Voice Cloning (Improvements/Update)
Hey guys! A couple of days ago I shared a live zero-shot face-swapping and voice-conversion project, but I thought it would be nice to let you guys know I've made some big improvements to the quality of the face swap through some pre/post-processing steps. Hope you guys enjoy the project and the little demo below. Link: https://github.com/luispark6/DoppleDanger
Really high s/it when training Lora
I'm really struggling here to train a LoRA using Musubi Tuner and the Hunyuan models.
When using the --fp8_base flag and the fp8 models, I am getting 466 s/it.
When using the normal (non-fp8) models, I am getting 200 s/it.
I am training using an RTX 4070 super 12GB.
I've followed everything here, https://github.com/kohya-ss/musubi-tuner, to configure it for low VRAM, and it seems to run worse than the standard high-VRAM setup? It doesn't make any sense to me. Any ideas?