Today, 10 July 2025 | StableDiffusion

Flux Kontext and the German language

10 July 2025 at 09:33

Just wanted to create a quick image. It's a German book for children, and it's kind of a meme in Germany to change the cover to something else. I wanted to see how Flux Kontext does the job.

The image itself aside, the text should be "Conni wird verklagt" (Conni gets sued), and it just doesn't get it right. I tried the English version in between, and it got it right on the first try.

I tried it with a basic Flux Kontext workflow, either with Nunchaku or with the Q6_K.gguf.

The prompt:

Change the text "Conni macht Musik" to the German text "Conni wird verklagt"

Change the scene so that the girl is sitting on a defendant's bench, wearing an orange jumpsuit and handcuffs. Change her facial expression to sad.

Maintain the style of the comic and maintain the facial features of the girl.

Has anyone else noticed issues with non-English text?

submitted by /u/Feroc

SingLoRA: a single-matrix LoRA variant halves parameters and boosts DreamBooth fidelity – when will we see it in ComfyUI?

10 July 2025 at 13:48

I stumbled upon this new arXiv preprint: “SingLoRA: Low-Rank Adaptation Using a Single Matrix.” It proposes a twist on standard LoRA by dropping the two-matrix approach (B A) in favor of a single matrix A applied as A Aᵀ. Some highlights:

  • ~50 % smaller adapters – one matrix instead of two, so half the trainable parameters and lighter .safetensors files
  • Only one learning rate
  • Higher accuracy with fewer params.
  • DreamBooth on SD 1.5 improved DINO-similarity by 5.4 % at the same rank
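To make the "half the parameters" point concrete, here is a minimal sketch in plain Python. It assumes a square d × d layer and uses made-up sizes (d = 768, rank 8) that are not taken from the paper. The function names are hypothetical, and the sketch does not cover how the paper handles non-square layers or scaling.

```python
# Illustrative sketch only: sizes and function names are assumptions,
# not values or APIs from the SingLoRA paper.

def lora_param_count(d_out, d_in, rank):
    """Standard LoRA: update = B @ A, with B (d_out x rank) and A (rank x d_in)."""
    return d_out * rank + rank * d_in


def single_matrix_param_count(d, rank):
    """Single-matrix variant on a square d x d layer: update = A @ A^T, A is (d x rank)."""
    return d * rank


def a_times_a_transpose(A):
    """Compute A @ A^T for a matrix stored as a list of rows."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in A] for row_i in A]


d, rank = 768, 8
standard = lora_param_count(d, d, rank)
single = single_matrix_param_count(d, rank)
print(standard, single, single / standard)

# Tiny worked example: d = 3, rank = 2.
A = [[1.0, 2.0], [0.0, 1.0], [3.0, 0.0]]
delta = a_times_a_transpose(A)
for row in delta:
    print(row)
```

One thing the toy example makes visible: an update of the form A Aᵀ is always symmetric, which is a structural difference from the general B A product.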

If I get this right it could be adopted in inference pipelines, meaning:

  • Smaller LoRA checkpoints
  • Less VRAM usage when merging
  • Simpler and more stable training with fewer hyper-params to tweak

Now the question we always ask: Comfy when?

Link to paper: https://arxiv.org/abs/2507.05566

submitted by /u/marcoc2

Introducing a new Lora Loader node which stores your trigger keywords and applies them to your prompt automatically

9 July 2025 at 15:39

This addresses an issue that I know many people complain about with ComfyUI. It introduces a LoRA loader that automatically swaps the trigger keywords when you change LoRAs. It saves triggers in ${comfy}/models/loras/triggers.json, but loading and saving triggers can be done entirely via the node. Just make sure to upload the JSON file if you use it on RunPod.

https://github.com/benstaniford/comfy-lora-loader-with-triggerdb

The examples above show how you can use this in conjunction with a prompt-building node like CR Combine Prompt to have prompts automatically rebuilt as you switch LoRAs.
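For readers curious what a trigger store like this might look like, here is a rough sketch. It is not the node's actual code. The only detail taken from the post is the `triggers.json` filename. The flat schema (LoRA filename mapped to a keyword string), the function names, and the example LoRA name are all assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical schema: {"<lora filename>": "<comma-separated trigger keywords>"}.
# The real node may store its data differently.


def load_triggers(db_path: Path) -> dict:
    """Read the trigger database, returning an empty dict if the file is missing."""
    if not db_path.exists():
        return {}
    return json.loads(db_path.read_text(encoding="utf-8"))


def save_trigger(db_path: Path, lora_name: str, trigger_words: str) -> None:
    """Store or overwrite the trigger keywords for one LoRA."""
    triggers = load_triggers(db_path)
    triggers[lora_name] = trigger_words
    db_path.write_text(json.dumps(triggers, indent=2), encoding="utf-8")


def build_prompt(db_path: Path, lora_name: str, base_prompt: str) -> str:
    """Prepend the stored trigger keywords (if any) to the base prompt."""
    trigger_words = load_triggers(db_path).get(lora_name, "")
    return f"{trigger_words}, {base_prompt}" if trigger_words else base_prompt


# Demo in a temporary directory so no real ComfyUI folder is touched.
with tempfile.TemporaryDirectory() as tmp:
    db = Path(tmp) / "triggers.json"
    save_trigger(db, "example_style.safetensors", "examplestyle, ink sketch")
    with_trigger = build_prompt(db, "example_style.safetensors", "a castle at dusk")
    without_trigger = build_prompt(db, "unknown.safetensors", "a castle at dusk")

print(with_trigger)
print(without_trigger)
```

Keeping the store as a single JSON file is what makes the RunPod advice above practical: copying one file carries all saved triggers to a new machine.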

Hope you have fun with it. Let me know on the GitHub page if you encounter any issues. I'll see if I can get it PR'd into ComfyUIManager's node list, but for now feel free to install it via the "Install Git URL" feature.

submitted by /u/ratttertintattertins

Storage of Files and Generation/Load Speed

10 July 2025 at 13:09

Hello, I was wondering what everyone is doing for storage. I have a fast M.2 drive with limited space that holds the core ComfyUI files and some models, but most of the larger full checkpoints and LoRAs are on an external drive. Does anyone have an optimal setup?

submitted by /u/fancy_scarecrow

FilMaster: Bridging Cinematic Principles and Generative AI for Automated Film Generation

10 July 2025 at 04:49

https://arxiv.org/pdf/2506.18899 https://filmaster-ai.github.io/

I'm not the author nor anyone involved. I just saw this and thought it was pretty cool, and wanted to hear your thoughts on it.

What do you guys think of it? Does it have the potential to surpass Veo, Runway, Kling, Wan, and VACE?

Quote:

What Makes FilMaster Different?

Built-in Cinematic Expertise: We don't just generate video; we apply cinematic principles in camera language design, cinematic rhythm control to create high-quality films, including a rich, dynamic audio landscape.

Fully Automated Production Pipeline: From script analysis to final render, FilMaster automates the entire process and delivers project files compatible with professional editing software.

More examples on their website: https://filmaster-ai.github.io/

submitted by /u/Quantum_Crusher

New LTXV IC-Lora Tutorial – Quick Video Walkthrough

9 July 2025 at 17:17

To support the community and help you get the most out of our new Control LoRAs, we’ve created a simple video tutorial showing how to set up and run our IC-LoRA workflow.

We’ll continue sharing more workflows and tips soon 🎉

For community workflows, early access, and technical help — join us on Discord!


submitted by /u/ofirbibi

I made anime colorization ControlNet Model v2 (SD 1.5)

9 July 2025 at 20:33

Hey everyone!
I just finished training my second ControlNet model for manga colorization – it takes black-and-white anime pictures and adds colors automatically.

https://preview.redd.it/kqq9mf76swbf1.png?width=884&format=png&auto=webp&s=e36f7eef62e64448a52f5bc780be0471bba5358b

I’ve compiled a new dataset that includes not only manga images, but also fan artworks of nature, cities etc.

Hugging Face model

ComfyUI workflow

I'd love for you to try it, share your results, and leave a review!

https://preview.redd.it/velplaldswbf1.png?width=881&format=png&auto=webp&s=8adb39a5aa9b8677ec4c0c5ee537793c3bff65bc

https://preview.redd.it/ckbx8eaiswbf1.png?width=887&format=png&auto=webp&s=a41fcbb1d765fbdc70d674e0c06914f68ca6a5d7

submitted by /u/InfamousPerformance8

FaceFusion DirectML: 6600XT reaching 93-97% utilisation + 7GB memory usage - yet only 7% memory controller utilisation (during Face Swapper processing)?

10 July 2025 at 11:02

(i7 14700K + 2x16GB DDR4 RAM)

During FaceFusion's initial 'Face Swapper' stage, HWiNFO reports GPU temperatures of about 67 °C overall (about 92 °C hot spot) and 92-97% core utilisation, alongside 7.1GB VRAM use (with the 'tolerant' video memory strategy setting).

Despite the evident GPU usage, 'memory controller utilisation' stays locked at 6-8% throughout this stage. During the subsequent 'Face Enhancer' stage it instead fluctuates erratically every second (between 2% and about 30%).

I've tried various setting combinations for execution thread count, execution queue count, video memory strategy, and system memory limit (plus maximum priority assigned to FaceFusion via ProcessLasso), to no avail. The 14700K stays at 50-60 °C and 14% total usage regardless of stage, and 'Physical Memory Available' behaves similarly, holding at a consistent 20GB (lowest point 15GB, even when 'system memory limit' is set to 28GB).

I've been planning to eventually acquire either an RTX 4070 Super or a 5070 for an overdue upgrade. It just hasn't been a huge priority given my current chronic League of Legends addiction and only occasional use of AI for minor tasks like upscaling (though that is likely due to current speeds when training models 🤕).

Any suggestions for solutions (and for alleviating the general AMD AI bottleneck) in the meantime?

Thank you in advance!

submitted by /u/Iron_Monkey

Stable Diffusion inpainting Model

10 July 2025 at 10:42

Need quick advice on generating realistic car shadows!

I'm working on a pipeline to add realistic shadows to car images (2D photos → same image with shadows). I'm under time constraints, so I need the fastest, most reliable approach. Should I go with traditional CV methods (segmentation + physics-based synthesis) or an ML-heavy approach (shadow segmentation + Stable Diffusion inpainting)? Any major gotchas with either direction?

Thanks!

https://preview.redd.it/l5l9y1z101cf1.jpg?width=494&format=pjpg&auto=webp&s=08e9e9b358688b69db36268600baab242e989ab4

submitted by /u/Ill-Potential-3739

Some wan2.1 text2image results.

9 July 2025 at 17:39

A candid kitchen-pass portrait of a focused young Korean-American chef plating a vibrant bibimbap bowl under the ivory glow of overhead heat lamps. She sports a black double-breasted chef coat flecked with tiny flour spots, and a colorful tattoo sleeve peeks beneath her rolled-up cuff. Stainless-steel counters, stacked porcelain, and a blur of bustling line cooks create a busy backdrop. The image features tiny steam wisps rising and diffused highlights on her glistening mise en place, captured with a slight handheld tilt for immediacy. The overall lighting and ambience emulate warm tungsten restaurant lighting mixed with cooler prep-station fluorescents, conveying an energetic yet intimate culinary moment.

A heartfelt, spontaneous photograph of an elderly Afro-Caribbean couple slow-dancing on their front porch under strings of vintage Edison bulbs at blue hour, the gentleman wearing a crisp linen guayabera and the lady in a flowing floral sundress. Their foreheads touch ever so gently, eyes closed in nostalgic bliss, while pastel Caribbean houses fade into bokeh behind them. The image features time-worn laugh lines, subtle age spots, and textured gray curls lit by soft, ambient porch light. The overall lighting and ambience feel reminiscent of film photography: warm, nostalgic amber tones with gentle grain and authentic shadow depth, making the scene tender and timeless.

A dimension-bending portrait of a master origami artist whose paper creations appear to animate and interact with their creator, blurring the boundary between art and reality. Delicate paper birds seem caught mid-flight around her contemplative figure as she folds new creations with meditative precision. Natural light through rice paper windows creates translucent effects that enhance the magical atmosphere while illuminating the extraordinary detail of both completed works and those in progress. The image captures the artist's lifetime of dedication in her weathered hands while her creations demonstrate impossible lightness and movement. The composition creates deliberate visual ambiguity about which elements are completed art, which are in progress, and which might be actual birds photographed in motion, challenging the viewer's perception of the creative process itself.

A time-collapsing portrait of three generations of women from the same family superimposed in the same kitchen space, each performing the same cooking tradition at different historical periods. The grandmother , 70 years old is wearing 1950s attire, mother, 40 years old is wearing 1980s fashion, and daughter, 18 years old is wearing modern fashion, occupy the same physical space while the kitchen details shift subtly between eras. The image captures identical genetic expressions and hand gestures passed through generations while showing the evolution of the same physical space. The composition maintains perfect alignment of architectural features while allowing temporal elements to blur and overlap, creating a visual family history that collapses time into a single frame while maintaining authentic period details from each era.

A hyperdynamic capture of an elderly martial arts master demonstrating a perfect spinning kick, his traditional gi creating a circular blur of white fabric against a minimalist dojo background. Despite his age, his body demonstrates extraordinary flexibility and power as wooden practice dummies splinter from the impact. Morning light streams through paper windows in visible beams, highlighting the explosion of wood fragments suspended in air. The image captures authentic aging with respectful detail while emphasizing the lifetime of discipline evident in his perfectly balanced form. The composition freezes the apex of rotation with the master's face in sharp focus amid the motion blur, creating a study of human mastery that transcends age.

A meticulously composed fine art photograph of a solitary figure draped in flowing white fabric standing in an abandoned marble quarry at dawn, their silhouette creating dramatic negative space against the geometric cuts in the stone. Soft morning mist drifts through the scene, catching the first rays of sunlight that filter through the industrial landscape. The fabric billows and twists in the gentle breeze, creating organic shapes that contrast with the harsh angular environment. The image captures ethereal movement frozen in time, with delicate gradations from deep shadows to luminous highlights, shot on medium format film for exceptional tonal range and subtle grain structure that adds to the dreamlike quality.

A stark black and white high contrast photograph of a dancer mid-leap against a pure white cyclorama, their muscular form creating bold geometric shapes with arms extended and legs bent at sharp angles. Deep, inky shadows carve out the definition of every muscle and tendon, while brilliant highlights emphasize the sheen of perspiration on their skin. The lighting setup uses harsh directional strobes from opposing angles, eliminating all mid-tones to create a graphic, almost abstract composition. The image features razor-sharp focus throughout, capturing every detail from the texture of their athletic wear to individual strands of hair frozen in motion, resulting in a powerful study of human form reduced to its essential elements.

https://preview.redd.it/ywdzvhtbwvbf1.png?width=1920&format=png&auto=webp&s=fca0f56ca77302fbcbd958a67b785e783888f36d

An electrifying concert capturing a rock guitarist mid-solo at the climax of their performance, sweat glistening under the stage lights as they bend backward in an impossible arch, hair whipping through beams of colored light. The crowd below reaches upward in a sea of raised hands, their faces illuminated by phone screens and stage effects. Smoke machines and laser lights create layers of atmosphere while maintaining sharp focus on the performer's intense expression. The image freezes a moment of pure energy, shot at high ISO to maintain fast shutter speed, with grain that adds to the raw, visceral feeling of live music.

https://preview.redd.it/7em1cmsbwvbf1.png?width=1920&format=png&auto=webp&s=157688fee33739031a55de2f6131fe792195984b

An avant-garde multiple exposure photograph combining a dancer's movement with projections of city lights, creating a human form that appears to be made of pure energy and urban landscapes. The technique layers dozens of exposures in-camera, with the subject moving through choreographed positions while colored lights and architectural projections paint patterns across their body. The final image shows a ghostly figure whose boundaries dissolve into streams of light and shadow, suggesting the intersection of human movement and urban rhythm. The color palette shifts from cool blues and purples in the shadows to warm oranges and yellows in the highlights, creating a visual symphony of motion and light.

I used the same workflow shared by @yanokusnir in his post: https://www.reddit.com/r/StableDiffusion/comments/1lu7nxx/wan_21_txt2img_is_amazing/

submitted by /u/Devajyoti1231

Photo like this.

10 July 2025 at 10:14

I wanted to recreate a photo like this on my own with my friends. I already have the background of this photo and our own photos, which we would like to merge with it to create a similar effect. I wanted to use ChatGPT for this, but every time I ask it to generate the merged photo it changes our clothes and logos, gives my friend a helmet because he is wearing a balaclava, and turns everything into a cartoonish look. My question is: is there an AI that could do something like this without all those side effects? I know for a fact that ChatGPT doesn't edit these photos but only generates new ones according to my commands, so is there an AI that actually edits photos by command and could pull this off? And if not, is there an AI that could make the edit we already tried to do ourselves look more realistic and natural? I'm talking about correcting shadows, figure placement, etc.

submitted by /u/Queasy-Baseball313

Made game art with SD and now I feel like I'm cheating

10 July 2025 at 13:07

Okay so I've been messing around with Stable Diffusion for a few weeks now and I accidentally created some really good game assets.

Started just for fun, trying to generate random fantasy creatures. But the output was actually usable? Like, with some cleanup and editing, these could totally work in an actual game.

I'm working on this little indie project (roguelike, naturally) and I was dreading the art phase because I can barely draw stick figures. But SD is generating concept art faster than I can evaluate it.

The weird part is I feel guilty about it. Like I'm somehow cheating by not spending months learning to draw properly. But then I remember that I'm a solo dev and if this tool helps me actually finish my game instead of getting stuck on art for years...

Saw that Ocean Keeper used some AI-assisted art in their development process and it got me thinking about where the line is. If you use AI for initial concepts but then hand-draw the final versions, is that different from using photos for reference?

The art purists are gonna hate this but honestly, SD is democratizing game development in a way that feels revolutionary. Small teams can now create visual assets that would have required a full art department before.

submitted by /u/000nana