Yesterday — 7 July 2025 · StableDiffusion

Lock-On Stabilization with Wan2.1 VACE outpainting

7 July 2025 at 06:45

I created a subject lock-on workflow in ComfyUI, inspired by this post.

The idea was to keep the subject fixed at the center of the frame. At that time, I achieved it by cropping the video to zoom in on the subject.

This time, I tried the opposite approach: when the camera follows the subject and part of it goes outside the original frame, I treated the missing area as padding and used Wan2.1 VACE to outpaint it.

While the results weren't bad, the process was quite sensitive to the subject's shape, which led to a lot of video shakiness. Some stabilization would likely improve it.

In fact, this workflow might be used as a new kind of video stabilization that doesn’t require narrowing the field of view.
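The recentering geometry can be sketched in a few lines of Python. Given a per-frame subject bounding box from any tracker (the function name and example values here are illustrative, not from the workflow itself), the shift that centers the subject also tells you how much of each edge becomes empty padding for VACE to outpaint:

```python
def lockon_shift(bbox, frame_w, frame_h):
    """Compute the (dx, dy) translation that centers the subject,
    plus the padding each edge needs after the shift (to be outpainted)."""
    x0, y0, x1, y1 = bbox
    cx, cy = (x0 + x1) / 2, (y0 + y1) / 2
    dx = frame_w / 2 - cx          # negative: frame content shifts left
    dy = frame_h / 2 - cy
    # After translating the frame by (dx, dy), these margins are empty:
    pad_left   = max(0, int(round(dx)))
    pad_top    = max(0, int(round(dy)))
    pad_right  = max(0, int(round(-dx)))
    pad_bottom = max(0, int(round(-dy)))
    return (dx, dy), (pad_left, pad_top, pad_right, pad_bottom)

# Subject sitting right-of-center in an 832x480 frame:
(dx, dy), pad = lockon_shift((500, 100, 700, 300), 832, 480)
# dx = 416 - 600 = -184, so the right edge needs 184 px outpainted
```

The per-frame jitter in these offsets is exactly where the shakiness comes from, which is why smoothing (dx, dy) over time would likely help.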

workflow : Lock-On Stabilization with Wan2.1 VACE outpainting

submitted by /u/nomadoor

New Illustrious Model: Sophos Realism

7 July 2025 at 01:51

I wanted to share this new merge I released today that I have been enjoying. Realism Illustrious models are nothing new, but I think this merge achieves a fun balance between realism and the danbooru prompt comprehension of the Illustrious anime models.

Sophos Realism v1.0 on CivitAI

(Note: The model card features some example images that would violate the rules of this subreddit. You can control what you see on CivitAI, so I figure it's fine to link to it. Just know that this model can do those kinds of images quite well too.)

The model card on CivitAI features all the details, including two LoRAs that I can't recommend enough for this model and really for any Illustrious model: dark (dramatic chiaroscuro lighting) and Stabilizer IL/NAI.

If you check it out, please let me know what you think of it. This is my first SDXL / Illustrious merge that I felt was worth sharing with the community.

submitted by /u/sophosympatheia

Are AI text-to-3D model services usable?

7 July 2025 at 13:42

Twenty years ago I wanted to build a game, then realized I had to learn 3D modeling with 3ds Max / Blender, which I tried and gave up on after a few months.

Over the weekend I dug up some game design files on my old desktop and realized we could just generate 3d models with prompts in 2025 (what a time to be alive). So far, I've been surprised by how good the capabilities of text to image and then image to 3D models already are.

Wouldn't say it's 100% there, but we're getting closer every few months, and new service platforms are improving with generally positive user feedback. I've got zero experience in 3D rendering, so I'm naively using default settings everywhere; here's a side-by-side comparison of the things I've tried.

I'm evaluating these two projects: 3DAIStudio and the open-source model TripoSR.

The prompt I'm evaluating is given below (~1000 characters):

A detailed 3D model of a female cyberpunk netrunner (cybernetic hacker), athletic and lean, with sharp features and glowing neon-blue cybernetic eyes—one covered by a sleek AR visor. Her hair is asymmetrical: half-shaved, with long, vibrant strands in purple and teal. She wears a tactical black bodysuit with hex patterns and glowing magenta/cyan circuit lines, layered with a cropped jacket featuring digital code motifs. Visible cybernetic implants run along her spine and forearms, with glowing nodes and fiber optics. A compact cyberdeck is strapped to her back; one gloved hand projects a holographic UI. Accessories include utility belts, an EMP grenade, and a smart pistol. She stands confidently on a rainy rooftop at night, neon-lit cityscape behind her, steam rising from vents. Neon reflections dance on wet surfaces. Mood is edgy, futuristic, and rebellious, with dramatic side lighting and high contrast.

Here are the output comparisons

First, we generate an image with Stable Diffusion (text-to-image):

https://preview.redd.it/hrfkav62cgbf1.jpg?width=1024&format=pjpg&auto=webp&s=a9df349743a8bb3fb1bdf5700edae84129240243

Tripo's output looks really good. There is some facial deformity (is that the right term?), but otherwise it's solid.

https://preview.redd.it/imywe82ydgbf1.png?width=1138&format=png&auto=webp&s=0f908e85be82857246e6442a2866f72463988c61

https://preview.redd.it/ld3wgx1ydgbf1.png?width=1324&format=png&auto=webp&s=c28feb425fec8c598d9f892c8974c5e715e4f639

Removing the texture

To separate the comparison, I reran the text-to-image prompt with OpenAI's gpt-image-1.

https://preview.redd.it/pkj4de97egbf1.png?width=1024&format=png&auto=webp&s=6e0538b3a98893d9ecf85a0757390eedabad14ac

https://preview.redd.it/fimfjs93ggbf1.png?width=1170&format=png&auto=webp&s=a4ea2c4348db5b1fe3dc4162c1aa00aa0a0aa8dc

https://preview.redd.it/3apji0q2kgbf1.png?width=666&format=png&auto=webp&s=2e39a1d14f2e92a54552d884d300c7cf7b978891

Both were generated with model and config defaults. I will retopo and fix the textures next, but this is a really good start that I will most likely import into Blender. Overall I like 3DAIStudio a tad more due to better facial construction. Since I have quite a few credits left on both, I'll keep testing and report back.

submitted by /u/Conscious_Tension811

Worth upgrading from a 3090 to a 5090 for local image and video generation?

7 July 2025 at 13:50

When Nvidia's 5000 series was released, there were a lot of problems, and most tools weren't optimized for the new architecture.

I am running a 3090 and casually explore local AI like image and video generation. It does work, and while image generation speeds are acceptable, some 960p WAN videos take up to 1.2 hours to generate. Meaning I can't use my PC, and I very rarely get what I want on the first try.

As the prices of the 5090 start to normalize in my region, I am becoming more open to investing in a better GPU. The question is: how much is the real-world performance gain, and do current tools use fp8 acceleration?

submitted by /u/cruel_frames

A question for the RTX 5090 owners

7 July 2025 at 13:22

I am slowly coming up on my goal of being able to afford the absolute cheapest Nvidia RTX 5090 within reach (MSI Ventus). I'd like to know from other 5090 owners: did you ditch all your GGUFs, fp8s, NF4s, Q4s, and turbo LoRAs the minute you installed your new 32 GB card, keeping or re-downloading only the full-size models? Or is there still a place for the smaller, VRAM-friendly models despite having a 5090?
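Whether the small formats stay useful is mostly VRAM arithmetic. A rough sketch (the bytes-per-parameter figures are approximate, and the 4 GB working-memory allowance is an assumption, not a measurement):

```python
# Approximate storage cost per weight for common formats:
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "nf4": 0.5, "q4_gguf": 0.56}

def fits(params_b, fmt, vram_gb, overhead_gb=4.0):
    """Rough check: weight size plus a working-memory allowance vs. card VRAM.
    overhead_gb (activations, VAE, text encoder) is a ballpark assumption."""
    weights_gb = params_b * BYTES_PER_PARAM[fmt]
    return weights_gb + overhead_gb <= vram_gb, round(weights_gb, 1)

# A hypothetical 14B video model:
print(fits(14, "fp16", 24))  # (False, 28.0) -> too big for a 3090
print(fits(14, "fp8", 24))   # (True, 14.0)  -> fp8 fits on 24 GB
print(fits(14, "fp16", 32))  # (True, 28.0)  -> fp16 just squeezes onto 32 GB
```

By this arithmetic a 32 GB card mostly removes the need for Q4/NF4 on today's model sizes, but quantized versions still leave headroom for longer videos and batch work.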

submitted by /u/RadiantPen8536

Using InstantID with ReActor AI for faceswap

6 July 2025 at 18:00

I was looking online for the best face-swap AI in ComfyUI and stumbled upon InstantID and ReActor as the two best options for now, so I compared both.

InstantID is better quality, more flexible results. It excels at preserving a person's identity while adapting it to various styles and poses, even from a single reference image. This makes it a powerful tool for creating stylized portraits and artistic interpretations. While InstantID's results are often superior, the likeness to the source is not always perfect.

ReActor on the other hand is highly effective for photorealistic face swapping. It can produce realistic results when swapping a face onto a target image or video, maintaining natural expressions and lighting. However, its performance can be limited with varied angles and it may produce pixelation artifacts. It also struggles with non-photorealistic styles, such as cartoons. And some here noted that ReActor can produce images with a low resolution of 128x128 pixels, which may require upscaling tools that can sometimes result in a loss of skin texture.
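The loss of skin texture is inherent to the small swap resolution: an upscaler can only interpolate or hallucinate detail, never recover it. A toy nearest-neighbour upscale makes the point: the grid gets bigger, but no new information appears.

```python
def upscale_nearest(img, factor):
    """Nearest-neighbour upscale of a 2-D pixel grid: every source pixel is
    duplicated factor x factor times, so no new detail is created."""
    return [[row[x // factor] for x in range(len(row) * factor)]
            for row in img for _ in range(factor)]

tile = [[10, 20],
        [30, 40]]
big = upscale_nearest(tile, 2)
# big is 4x4, but still contains only the original four values
```

Real upscalers (ESRGAN and friends) do better than this by inventing plausible detail, which is exactly why the result can look waxy where real skin texture used to be.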

So the obvious route would've been InstantID, until I stumbled on someone who said he used both together as you can see here.

Which is a really great idea that handles both weaknesses. But my question is: is it still functional? The workflow is a year old. I know that ReActor is discontinued, but InstantID isn't. Can someone try this and confirm?

submitted by /u/Star-Light-9698

WAN Handheld Camera motion?

7 July 2025 at 05:51

Hello!
Has anyone had any luck getting handheld camera motion out of WAN? All I've gotten so far are dollies, pans, and zooms, but there seems to be no way to create video with a more dynamic/shaky camera yet. Seems like something that could be achieved with a LoRA?

submitted by /u/Draufgaenger

Character Generation Workflow App for ComfyUI

6 July 2025 at 21:39

Hey everyone,

I've been working on a Gradio-based frontend for ComfyUI that focuses on consistent character generation. It's not revolutionary by any means, but it has been an interesting experience for me. It's built around ComfyScript, in a limbo between pure Python and the ComfyUI API format, which means that while the workflow you get is fully usable in ComfyUI, it is very messy.

The application includes the following features:

  • Step-by-step detail enhancement (face, skin, hair, eyes)
  • Iterative latent and final image upscaling
  • Optional inpainting of existing images
  • Florence2 captioning for quick prompt generation
  • A built-in Character Manager for editing and previewing your character list

I initially built it to help generate datasets for custom characters. While this can be achieved by prompting, models usually carry an inherent bias. For example, it's difficult to produce dark-skinned people with red hair, or to get a specific facial structure or skin color in combination with a specific ethnicity. This was a way to solve that issue by iteratively inpainting different parts to get a unique character.

So far, it's worked pretty well for me, and so I thought to showcase my work. It's very opinionated, and is built around the way I work, but that doesn't mean it has to stay that way. If anyone has any suggestions or ideas for features, please let me know, either here or by opening an issue or pull request.
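For anyone unfamiliar with the ComfyUI API format mentioned above: it's a flat JSON graph of nodes that can be POSTed to a running ComfyUI server's /prompt endpoint. A minimal sketch (the node IDs, checkpoint name, and server address are placeholders):

```python
import json
import urllib.request

# A two-node fragment in ComfyUI API format: each key is a node id,
# each value names the node class and wires its inputs. ["1", 1] means
# "output slot 1 of node 1" (the CLIP output of the checkpoint loader).
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "model.safetensors"}},
    "2": {"class_type": "CLIPTextEncode",
          "inputs": {"text": "portrait of a character", "clip": ["1", 1]}},
}

payload = json.dumps({"prompt": workflow}).encode()
req = urllib.request.Request("http://127.0.0.1:8188/prompt", data=payload,
                             headers={"Content-Type": "application/json"})
# urllib.request.urlopen(req)  # uncomment with a ComfyUI server running
```

ComfyScript generates graphs like this from Python code, which is why the exported workflows load fine but look tangled in the node editor.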

Here's an imgur album of some images. Most are from the repository, but there are two additional examples: https://imgur.com/a/NZU8LEP

submitted by /u/ScarTarg

Does expanding to 64 GB RAM make sense?

6 July 2025 at 17:16

Hello guys. Currently I have a 3090 with 24 GB VRAM + 32 GB RAM. Since DDR4 memory has hit the end of its production cycle, I need to make a decision now. I work mainly with Flux, WAN, and VACE. Could expanding my RAM to 64 GB make any difference in generation time? Or do I simply not need more than 32 GB with 24 GB of VRAM? Thanks for your input in advance.
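As a rough sanity check, extra system RAM mainly matters when model weights are offloaded (block-swapped) out of VRAM into CPU memory. A back-of-the-envelope sketch, where every figure is a ballpark assumption rather than a measurement:

```python
def ram_budget(offloaded_weights_gb, os_and_apps_gb=8.0, comfy_overhead_gb=4.0):
    """System RAM needed when part of a model is kept in CPU memory.
    All components are ballpark assumptions, not measurements."""
    return offloaded_weights_gb + os_and_apps_gb + comfy_overhead_gb

# e.g. offloading ~25 GB of fp16 video-model weights plus text encoder:
print(ram_budget(25.0))  # 37.0 -> over a 32 GB budget, comfortable within 64 GB
```

So if the workflows spill large fp16 weights to system RAM, 64 GB removes swapping to disk; if everything already fits in 24 GB VRAM plus 32 GB RAM, the upgrade mostly buys headroom rather than speed.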

submitted by /u/Zephyryhpez

Workflow help desk - paid service

7 July 2025 at 14:33

I know ads are bad, but this is almost a public service to the community

I will help build that simple workflow to satisfy your needs; I will cut the fat.

I am open to all kinds of things

How much?

It is illegal to buy crypto with the money in my bank account in my country

So, I will take anything that you can offer. I may do it for free if you ask nicely

Payment method?

PayPal

Let me help you

DM me now before it's too late

submitted by /u/Won3wan32

I've been trying to get the SD.Next UI to run, but nothing happens. Am I missing anything? ZLUDA is in the files, but it says it can't find it.

7 July 2025 at 14:24

Using VENV: C:\SD.next\sdnext\venv

22:03:13-972163 INFO Starting SD.Next
22:03:13-986475 INFO Logger: file="C:\SD.next\sdnext\sdnext.log" level=INFO host="LAPTOP-T2GEUGHV" size=127006 mode=append
22:03:13-988474 INFO Python: version=3.10.6 platform=Windows bin="C:\SD.next\sdnext\venv\Scripts\python.exe" venv="C:\SD.next\sdnext\venv"
22:03:14-195598 INFO Version: app=sd.next updated=2025-07-06 hash=d5d857aa branch=master url=https://github.com/vladmandic/sdnext/tree/master ui=main
22:03:14-685663 INFO Version: app=sd.next latest=2025-07-06T00:17:54Z hash=d5d857aa branch=master
22:03:14-696808 INFO Platform: arch=AMD64 cpu=AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD system=Windows release=Windows-10-10.0.26100-SP0 python=3.10.6 locale=('English_Malaysia', '1252') docker=False
22:03:14-700326 INFO Args: []
22:03:14-710840 INFO ROCm: AMD toolkit detected
22:03:14-747216 WARNING ROCm: no agent was found
22:03:14-747216 INFO ROCm: version=6.2
22:03:14-749813 WARNING Failed to load ZLUDA: Could not find module 'C:\SD.next\ZLUDA-nightly-windows-rocm6-amd64\nvcuda.dll\nvcuda.dll' (or one of its dependencies). Try using the full path with constructor syntax.
22:03:14-750823 INFO Using CPU-only torch
22:03:14-751857 INFO ROCm: HSA_OVERRIDE_GFX_VERSION auto config skipped: device=None version=None
22:03:14-840100 WARNING Modified files: ['webui.bat']
22:03:14-916709 INFO Install: verifying requirements
22:03:14-975612 INFO Extensions: disabled=[]
22:03:14-976628 INFO Extensions: path="extensions-builtin" enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg']
22:03:14-982038 INFO Extensions: path="extensions" enabled=[]
22:03:14-983043 INFO Startup: quick launch
22:03:14-985188 INFO Extensions: disabled=[]
22:03:14-986191 INFO Extensions: path="extensions-builtin" enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg']
22:03:14-990187 INFO Extensions: path="extensions" enabled=[]
22:03:14-995283 INFO Installer time: total=1.78 latest=0.70 base=0.28 version=0.20 git=0.17 files=0.09 requirements=0.08 log=0.08 installed=0.08 torch=0.05
22:03:14-997330 INFO Command line args: [] args=[]
22:03:22-627821 INFO Torch: torch==2.7.1+cpu torchvision==0.22.1+cpu
22:03:22-629821 INFO Packages: diffusers==0.35.0.dev0 transformers==4.53.0 accelerate==1.8.1 gradio==3.43.2 pydantic==1.10.21
22:03:23-331756 INFO Engine: backend=Backend.DIFFUSERS compute=cpu device=cpu attention="Scaled-Dot-Product" mode=no_grad
22:03:23-336881 INFO Torch parameters: backend=cpu device=cpu config=Auto dtype=torch.float32 context=no_grad nohalf=False nohalfvae=False upcast=False deterministic=False tunable=[False, False] fp16=fail bf16=fail optimization="Scaled-Dot-Product"
22:03:23-338880 INFO Device:
22:03:23-609726 INFO Available VAEs: path="models\VAE" items=0
22:03:23-611726 INFO Available UNets: path="models\UNET" items=0
22:03:23-612730 INFO Available TEs: path="models\Text-encoder" items=0
22:03:23-615391 INFO Available Models: safetensors="models\Stable-diffusion":2 diffusers="models\Diffusers":0 items=2 time=0.00
22:03:23-626224 INFO Available LoRAs: path="models\Lora" items=0 folders=2 time=0.00
22:03:23-645701 INFO Available Styles: path="models\styles" items=288 time=0.02
22:03:23-726925 INFO Available Detailer: path="models\yolo" items=10 downloaded=0
22:03:23-728936 INFO Load extensions
22:03:24-730797 INFO Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
22:03:24-750484 INFO Available Upscalers: items=72 downloaded=0 user=0 time=0.01 types=['None', 'Resize', 'Latent', 'AsymmetricVAE', 'DCC', 'VIPS', 'ChaiNNer', 'AuraSR', 'ESRGAN', 'RealESRGAN', 'SCUNet', 'Diffusion', 'SwinIR']
22:03:24-757459 INFO UI locale: name="Auto"
22:03:24-758749 INFO UI theme: type=Standard name="black-teal" available=13
22:03:26-918871 INFO Extension list is empty: refresh required
22:03:28-309571 INFO Local URL: http://127.0.0.1:7860/
22:03:28-530142 INFO [AgentScheduler] Task queue is empty
22:03:28-531141 INFO [AgentScheduler] Registering APIs
22:03:29-018353 INFO Selecting first available checkpoint
22:03:29-020355 INFO Startup time: total=18.19 torch=7.49 launch=1.60 ui-extensions=1.59 installer=1.39 libraries=1.12 gradio=1.02 extensions=1.01 app-started=0.58 ui-networks=0.32 ui-control=0.31 ui-txt2img=0.30 ui-video=0.27 ui-img2img=0.18 transformers=0.15 ui-defaults=0.13 ui-models=0.13 api=0.12 diffusers=0.11 detailer=0.08 onnx=0.05
22:05:29-028702 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0 total=1 step=0 steps=0 queued=0 uptime=126 elapsed=120.01 eta=None progress=0
[... the same idle TRACE entry repeats every ~2 minutes ...]
22:21:35-519983 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0 total=1 step=0 steps=0 queued=0 uptime=1092 elapsed=1086.5 eta=None progress=0

What am I missing here?

submitted by /u/RookChan

I can't trigger my LoRA

7 July 2025 at 14:07

Hello! I'm trying to train my own LoRA on my computer. The training completes without errors, but I can't trigger the LoRA. I don't think the problem is in the dataset; a LoRA created with the same dataset on civitai works fine. Please tell me what I'm doing wrong.

Link to lora: https://drive.google.com/file/d/11detwAzPjsaoVkbifEGuwZXXmEhSAdHY/view?usp=drive_link

Link to kohya_ss configuration: https://drive.google.com/file/d/1wIOKVzooiEXrwnrF5d4U5nlW5XGcJx4r/view?usp=drive_link

submitted by /u/fjay69

Kohya - LoRA GGPO? Has anyone tested this configuration?

7 July 2025 at 13:46

LoRA-GGPO (Gradient-Guided Perturbation Optimization), a novel method that leverages gradient and weight norms to generate targeted perturbations. By optimizing the sharpness of the loss landscape, LoRA-GGPO guides the model toward flatter minima, mitigating the double descent problem and improving generalization.
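The description is close in spirit to sharpness-aware minimization. A toy one-parameter sketch of the perturb-then-update idea (this is a generic SAM-style step, not the exact LoRA-GGPO algorithm, and the learning rate and rho values are illustrative):

```python
def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware update on a scalar weight:
    1) perturb w along the normalized gradient toward a nearby 'sharp' point,
    2) take the real step using the gradient measured at that point."""
    g = grad_fn(w)
    eps = rho * (1 if g >= 0 else -1)   # g / |g| in the scalar case
    g_adv = grad_fn(w + eps)            # gradient at the perturbed weight
    return w - lr * g_adv

# Minimizing f(w) = (w - 3)^2, whose gradient is 2(w - 3):
grad = lambda w: 2 * (w - 3)
w = 0.0
for _ in range(100):
    w = sam_step(w, grad)
# w ends up near 3, hovering within the rho-sized perturbation band
```

Optimizing against the perturbed gradient penalizes solutions where the loss rises steeply nearby, which is the "flatter minima" behavior the abstract claims.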

submitted by /u/More_Bid_2197