Fluffy Kontext
RetroVHS Mavica-5000 - Flux.dev LoRA
I lied a little: it's not pure VHS. The Sony ProMavica MVC-5000 is a still-video camera that saves single video frames to floppy disks. Yep, it's another VHS-flavored LoRA, but this one isn't washed-out like the 2000s Analog Core LoRAs. Think ProMavica after a spa day: cleaner grain, moodier contrast, and even the occasional surprisingly pretty bokeh. The result lands somewhere between late-'80s broadcast footage and a '90s TV drama freeze-frame: VHS flavour, minus the total mud-bath.

Why bother?

- More cinematic shadows and color depth.
- Still keeps that sweet lo-fi noise, chroma wiggle, and subtle smear, so nothing ever feels too modern.
- Low-dynamic-range pastel palette: cyan shadows, magenta mids, bloom-happy highlights.

You can find the LoRA here: https://civitai.com/models/1738734/retrovhs-mavica-5000

P.S.: I plan to adapt at least some of my LoRAs to Flux Kontext in the near future.
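For anyone who wants to try a LoRA like this outside of a UI, here is a minimal diffusers sketch for loading a downloaded Flux LoRA. The filename and prompt are placeholders (not from the post), and any trigger words listed on the Civitai page would still need to be added:

```python
# Minimal sketch: loading a downloaded Flux LoRA with diffusers.
# The .safetensors filename below is a placeholder; use the file you
# actually download from the Civitai page linked above.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("retrovhs_mavica5000.safetensors")  # placeholder filename

# Add the LoRA's trigger word(s) to the prompt if the model page lists any.
image = pipe(
    "portrait of a woman in a dim kitchen, late-80s still-video look",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("retrovhs_test.png")
```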
Homemade SD1.5 major update 1❗️
I've made some major improvements to my custom mobile homemade SD1.5 model. All the pictures I uploaded were created purely by the model, without using any LoRAs or additional tools. All the training and all the pictures I uploaded were done on my phone. I have a Mac mini M4 16 GB on the way, so I'm excited to push the model even further. I'm also almost done fixing the famous hand/finger issue that SD1.5 is known for. I'm striving to match Midjourney, or get as close as I can, in terms of capability.
OmniAvatar released the model weights for Wan 1.3B!
OmniAvatar released the model weights for Wan 1.3B! For those who don't know, OmniAvatar is an improved model based on FantasyTalking. GitHub: https://github.com/Omni-Avatar/OmniAvatar

We still need a ComfyUI implementation for this; as of now, there is no native way to run audio-driven avatar video generation in Comfy. Maybe the great u/Kijai can add this to his WAN-Wrapper?

The video is not mine; it's from user nitinmukesh, who posted it here along with more info: https://github.com/Omni-Avatar/OmniAvatar/issues/19

P.S. He ran it with 8 GB of VRAM.
Chattable Wan & FLUX knowledge bases
I used NotebookLM to make chattable knowledge bases for FLUX and Wan video. The information comes from the Banodoco Discord FLUX and Wan channels, which I scraped and added as sources. It works incredibly well at taking unstructured chat data and turning it into organized, cited information!

Links: 🔗 FLUX Chattable KB (last updated July 1)

You can ask questions like:
or for Wan:
Thanks to the Banodoco community for the vibrant, in-depth discussion. 🙏🏻 It would be cool to add Reddit conversations to knowledge bases like this in the future. Tools and info if you'd like to make your own:
Local image processing for garment image enhancement
Looking for a locally run image-processing solution to tidy up photos of garments like the attached images. Any and all suggestions welcome, thank you.
Universal Method for Training Kontext LoRAs without having to find or edit pairs of images
So, the problem with training Flux Kontext is that it needs pairs of images. For example, if you want to train an oil-painting style, you would need a photo of a place plus a corresponding painting. It can be slow and laborious to find or edit such pairs. BUT it doesn't have to be that way:

1) Gather images in the style you want, for example Pixar/Disney style.
2) Use Flux Kontext to convert these images to a style the base Kontext model already knows, for example plain cartoon. You then train a LoRA on the pairs: Pixar images + their cartoon conversions. (A rough batch-conversion sketch for this step is below the list.)
3) After the LoRA is trained, pick any image, say a photo of New York City, and use Flux Kontext to convert that photo to cartoon.
4) Lastly, apply the LoRA to the cartoon photo of New York City to get the target style.

This is a hypothetical method.
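As a rough sketch, step 2 could be batched with diffusers' FluxKontextPipeline. The model ID, prompt, and guidance value here are assumptions based on the public FLUX.1 Kontext [dev] examples, not something specified in the post:

```python
# Sketch of step 2: batch-convert a folder of style images (e.g. Pixar-like)
# into a style the base Kontext model already knows (e.g. flat cartoon),
# producing the paired data for LoRA training.
# Assumes a diffusers version that ships FluxKontextPipeline.
from pathlib import Path

import torch
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

pipe = FluxKontextPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Kontext-dev", torch_dtype=torch.bfloat16
).to("cuda")

src_dir = Path("style_images")       # images in the target style (training target side)
dst_dir = Path("converted_cartoon")  # known-style conversions (training source side)
dst_dir.mkdir(exist_ok=True)

for img_path in sorted(src_dir.glob("*.png")):
    image = load_image(str(img_path))
    result = pipe(
        image=image,
        prompt="Convert this image into a simple flat cartoon illustration",
        guidance_scale=2.5,
    ).images[0]
    result.save(dst_dir / img_path.name)  # same filename, so the pairs line up
```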
Anyone else struggling with hands in image generation? Using FLUX.1 on Perchance...
Hey everyone,
So I’ve been messing around with this open-source image generator based on FLUX.1-schnell—I'm using it through Perchance over here: https://perchance.org/ai-text-to-image-generator — and man, I know rendering hands has always been a weak point in AI art, but this model really seems to have a vendetta against fingers.
Like, I get it. Every model has its quirks. But with this one, even when the rest of the image looks solid—composition, lighting, facial structure, clothing, all looking tight—the hands end up looking like a cursed pretzel made of flesh. Sometimes it's like they're fused together, or there are way too many fingers, or worse... it just blends into some kind of melted glove.
I’ve tried changing the prompts, getting really specific—stuff like “realistic hands,” “five fingers,” “clear hand anatomy,” or even things like “holding a glass with a natural hand pose.” I've also tried negative prompting with “blurry hands,” “distorted fingers,” “mutated limbs,” etc. Still... no luck. Same nightmare fuel.
So, I’m curious:
- Has anyone found any prompt tricks that consistently help with hands in FLUX.1 or similar models?
- Is there some sort of setting or seed manipulation I should be looking into?
- Or is this just the nature of the beast with this specific model?
I know Stable Diffusion and others have a bunch of tools and add-ons like ControlNet or inpainting that help, but since I’m using this one directly through Perchance, I don’t think there’s a lot of backend flexibility. Still, if anyone knows a way to get halfway-decent hands—or at least stop them from merging into creepy finger-blobs—I’m all ears.
Thanks in advance. Just trying to keep my generated people from looking like they’ve been sculpted by a drunk octopus.
A huge thanks to the nunchaku team.
I just wanted to say thank you. Nunchaku looks like magic, for real. I went from 9.5 s/it on my 8 GB 4070 laptop GPU to 1.5 s/it.
I tried plugging in my 3090 eGPU, and it sits at 1 s/it, so a full-sized 3090 is only marginally faster than a laptop GPU with a third of the VRAM.
I really hope all future models will implement this, it really looks like black magic.
EDIT: it was s/it, not it/s
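For context, this is roughly what the nunchaku setup looks like in plain diffusers. The transformer class and model IDs are recalled from the nunchaku README and may have changed between releases, so treat it as a sketch and check the repo for current equivalents:

```python
# Sketch: running Flux with nunchaku's SVDQuant 4-bit transformer in diffusers.
# Class and model names are taken from the nunchaku README and may differ
# in newer releases.
import torch
from diffusers import FluxPipeline
from nunchaku import NunchakuFluxTransformer2dModel

# Load the pre-quantized INT4 Flux transformer released by the nunchaku team.
transformer = NunchakuFluxTransformer2dModel.from_pretrained(
    "mit-han-lab/svdq-int4-flux.1-dev"
)

# Drop it into a standard Flux pipeline in place of the bf16 transformer.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
).to("cuda")

image = pipe(
    "a cat holding a sign that says hello world",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("nunchaku_test.png")
```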
MediaSyncer - Easily play multiple videos/images at once in sync! Great for comparing generations. Free and Open Source!
https://whatdreamscost.github.io/MediaSyncer/

I made this media player last night (or mainly AI did) since I couldn't find a program that could easily play multiple videos in sync at once. I just wanted something I could use to quickly compare generations. It can't handle many large 4K video files (it's a very basic program), but it's good enough for what I needed it for. If anyone wants to use it, there it is, or you can get a local version here: https://github.com/WhatDreamsCost/MediaSyncer
Need help catching up. What’s happened since SD3?
Hey, all. I’ve been out of the loop since the initial release of SD3 and all the drama. I was new and using 1.5 up to that point, but I moved out of the country and fell out of using SD. I’m trying to pick back up, but it’s been over a year, so I don’t even know where to begin. Can y’all provide some key developments I can look into and point me in the direction of the latest meta?
SimpleTuner v2.0.1 with 2x Flux training speedup on Hopper + Blackwell support now by default
https://github.com/bghira/SimpleTuner/releases/tag/v2.0.1
Also, you can now use Hugging Face Datasets more directly: it has its own defined data backend type and a caching layer, and it is fully integrated into the dataloader config pipeline, so you can cache to S3 buckets or a local partition as usual.
Some small speed-ups for S3 dataset loading w/ millions of samples.
Wan 14B training speedups to come soon.
Wan 2.1 pixelated eyes
Hi guys,
I have an RTX 3070 Ti, so I'm only working with 8 GB of VRAM for Wan 2.1 + Self Forcing.
I generate with:
- 81 frames
- 640 x 640
- CFG 1
- Steps 4
The eyes always lose quality post-render. Is there any way for me to fix this? Or is it really just a matter of more VRAM, so I can run at 1280 x 1280 or above to keep eye quality?
Thanks
Live Face Swap and Voice Cloning (Improvements/Update)
Hey guys! A couple of days ago I shared a live zero-shot face-swapping and voice-conversion project, but I thought it would be nice to let you guys know I've made some big improvements to the quality of the face swap through some pre/post-processing steps. Hope you guys enjoy the project and the little demo below. Link: https://github.com/luispark6/DoppleDanger
Really high s/it when training Lora
I'm really struggling here to train a LoRA using Musubi Tuner and the Hunyuan models.
When using the --fp8_base flag and the fp8 models, I am getting 466 s/it.
When using the normal (non-fp8) models, I am getting 200 s/it.
I am training using an RTX 4070 super 12GB.
I've followed everything here, https://github.com/kohya-ss/musubi-tuner, to configure it for low VRAM, and it seems to run worse than the standard high-VRAM setup? It doesn't make any sense to me. Any ideas?