Flux Kontext Character Turnaround Sheet LoRA
submitted by /u/sktksm
I created a subject lock-on workflow in ComfyUI, inspired by this post. The idea is to keep the subject fixed at the center of the frame. Last time, I achieved this by cropping the video to zoom in on the subject. This time I tried the opposite approach: when the camera follows the subject and part of the content falls outside the original frame, I treat the missing area as padding and use Wan2.1 VACE to outpaint it. While the results aren't bad, the process is quite sensitive to the subject's shape, which leads to a lot of video shakiness; some stabilization would likely improve it. In fact, this workflow might work as a new kind of video stabilization that doesn't require narrowing the field of view. Workflow: Lock-On Stabilization with Wan2.1 VACE outpainting
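For anyone who wants to prototype the recenter-and-pad step outside ComfyUI, here's a minimal NumPy sketch of the idea. It assumes you already have a per-frame subject center (cx, cy) from some tracker; those inputs are my assumption, not part of the posted workflow.

```python
# Minimal sketch: shift a frame so the tracked subject sits at the center,
# and produce a mask marking the uncovered border for outpainting.
import numpy as np

def recenter_with_pad_mask(frame: np.ndarray, cx: int, cy: int):
    """Return the shifted frame plus a mask (255 = area to outpaint)."""
    h, w = frame.shape[:2]
    dx, dy = w // 2 - cx, h // 2 - cy
    shifted = np.zeros_like(frame)
    mask = np.full((h, w), 255, dtype=np.uint8)
    # Source/destination windows for the region both frames share
    src_x0, src_y0 = max(0, -dx), max(0, -dy)
    dst_x0, dst_y0 = max(0, dx), max(0, dy)
    cw, ch = w - abs(dx), h - abs(dy)
    shifted[dst_y0:dst_y0 + ch, dst_x0:dst_x0 + cw] = \
        frame[src_y0:src_y0 + ch, src_x0:src_x0 + cw]
    mask[dst_y0:dst_y0 + ch, dst_x0:dst_x0 + cw] = 0  # covered by real pixels
    return shifted, mask
```

The shifted frame plus mask is the pair an outpainting model like VACE consumes: the real pixels stay fixed and only the masked border is generated.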
I wanted to share this new merge I released today that I have been enjoying. Realism Illustrious models are nothing new, but I think this merge strikes a fun balance between realism and the danbooru prompt comprehension of the Illustrious anime models. Sophos Realism v1.0 on CivitAI (Note: the model card features some example images that would violate the rules of this subreddit. You can control what you see on CivitAI, so I figure it's fine to link to it. Just know that this model can do those kinds of images quite well too.) The model card on CivitAI has all the details, including two LoRAs that I can't recommend enough for this model, and really for any Illustrious model: dark (dramatic chiaroscuro lighting) and Stabilizer IL/NAI. If you check it out, please let me know what you think. This is my first SDXL / Illustrious merge that I felt was worth sharing with the community.
Here is the GitHub link; you don't need to install any dependencies: https://github.com/SanicsP/ComfyUI-CsvUtils
Twenty years ago I wanted to build a game, realized I had to learn 3D modeling with 3ds Max / Blender, tried it, and gave up after a few months. Over the weekend I dug up some game design files on my old desktop and realized we can just generate 3D models with prompts in 2025 (what a time to be alive). So far I've been surprised by how good text-to-image and then image-to-3D already are. I wouldn't say it's 100% there, but we're getting closer every few months, and new service platforms are improving with generally positive user feedback. Lastly, I've got zero experience in 3D rendering, so I'm naively using default settings everywhere; here's just me doing a side-by-side comparison of things I've tried. I'm evaluating these two projects: 3DAIStudio and the open-source model TripoSR. The prompt I'm evaluating is given below (~1000 characters)
Here are the output comparisons. First we generate an image with text-to-image using Stable Diffusion. The Tripo output looks really good; some facial deformity (is that the right term?), but otherwise it's solid. To separate the comparison, I reran the text-to-image prompt with OpenAI gpt-image-1. Both were generated with model and config defaults. I will retopologize and fix the textures next, but this is a really good start that I will most likely import into Blender. Overall I like 3DAIStudio a tad more due to better facial construction. Since I have quite a few credits left on both, I'll keep testing and report back.
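For reference, the text-to-image-to-3D pipeline described here can be scripted end to end. Below is a rough sketch: the diffusers call is standard, while the TripoSR calls follow the example script in the TripoSR repo (treat those names, and the model IDs, as assumptions rather than a tested recipe).

```python
# Sketch of the two-stage pipeline: prompt -> image -> mesh.
import torch
from diffusers import StableDiffusionXLPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"

# Stage 1: text -> image
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to(device)
image = pipe("a stylized game character, full body, neutral pose").images[0]
image.save("character.png")

# Stage 2: image -> 3D with TripoSR (API names per the repo's run.py)
from tsr.system import TSR

model = TSR.from_pretrained(
    "stabilityai/TripoSR", config_name="config.yaml", weight_name="model.ckpt"
).to(device)
scene_codes = model([image], device=device)
mesh = model.extract_mesh(scene_codes)[0]
mesh.export("character.obj")  # import into Blender for retopo/texturing
```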
submitted by /u/Z3ROCOOL22
When Nvidia's 5000 series was released, there were a lot of problems, and most tools weren't optimized for the new architecture.
I am running a 3090 and casually explore local AI, like image and video generation. It works, and while image generation is acceptably fast, some 960p WAN videos take up to 1.2 hours to generate. That means I can't use my PC in the meantime, and I very rarely get what I want on the first try.
As 5090 prices start to normalize in my region, I am becoming more open to investing in a better GPU. The question is: how big is the real-world performance gain, and do current tools use fp8 acceleration?
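For context, whether fp8 helps depends on both the installed PyTorch build and the GPU's compute capability (a 3090 is sm_86, Ada cards are sm_89, and Blackwell cards like the 5090 report sm_120). A minimal check, assuming a recent PyTorch with CUDA:

```python
# Quick probe: does this torch build expose fp8 dtypes, and is the GPU
# new enough for hardware fp8 (Ada sm_89 or later)?
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability()
    print(f"Compute capability: sm_{major}{minor}")
    print("fp8 dtypes in this torch build:", hasattr(torch, "float8_e4m3fn"))
    print("Hardware fp8 capable:", (major, minor) >= (8, 9))
else:
    print("No CUDA device detected")
```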
I am slowly closing in on my goal of affording the absolute cheapest Nvidia RTX 5090 within reach (MSI Ventus). I'd like to know from other 5090 owners: did you ditch all your GGUFs, fp8s, NF4s, Q4s, and turbo LoRAs the minute you installed your new 32 GB card, keeping or re-downloading only the full-size models? Or is there still a place for the smaller, VRAM-friendly models even with a 5090?
I was looking online for the best face-swap AI in ComfyUI and stumbled upon InstantID and ReActor as the best two for now, so I compared them. InstantID gives better quality and more flexible results. It excels at preserving a person's identity while adapting it to various styles and poses, even from a single reference image, which makes it a powerful tool for stylized portraits and artistic interpretations. While InstantID's results are often superior, the likeness to the source is not always perfect. ReActor, on the other hand, is highly effective for photorealistic face swapping. It can produce realistic results when swapping a face onto a target image or video, maintaining natural expressions and lighting. However, its performance can be limited with varied angles, it may produce pixelation artifacts, and it struggles with non-photorealistic styles such as cartoons. Some here noted that ReActor outputs faces at a low resolution of 128x128 pixels, which may require upscaling tools that can sometimes lose skin texture. So the obvious route would have been InstantID, until I stumbled on someone who said he used both together, as you can see here, which is a really great idea that covers both weaknesses. But my question is: is it still functional? The workflow is one year old. I know that ReActor is discontinued, but InstantID isn't. Can someone try this and confirm?
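Background on the 128x128 limitation: ReActor wraps insightface's inswapper_128 model, which performs the swap on a fixed 128x128 face crop before pasting it back, which is why a restoration/upscaling pass is usually chained after it. A minimal sketch using insightface directly (file paths here are hypothetical):

```python
# Face-swap sketch with insightface's inswapper_128 (the model ReActor
# builds on); input/output paths are placeholders.
import cv2
import insightface
from insightface.app import FaceAnalysis

app = FaceAnalysis(name="buffalo_l")
app.prepare(ctx_id=0, det_size=(640, 640))
swapper = insightface.model_zoo.get_model("inswapper_128.onnx")

source = cv2.imread("source_face.jpg")   # identity to transfer
target = cv2.imread("target_scene.jpg")  # image to swap into

src_face = app.get(source)[0]
for tgt_face in app.get(target):
    # The swap itself happens on a 128x128 crop, hence the soft result
    target = swapper.get(target, tgt_face, src_face, paste_back=True)

cv2.imwrite("swapped.jpg", target)  # typically followed by GFPGAN/CodeFormer
```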
Why are more and more checkpoint/model/LoRA releases based on SDXL or SD1.5 instead of SD3? Is it just because of low VRAM, or is something missing in SD3?
Hello!
Has anyone had any luck getting handheld camera motion out of WAN? All I've gotten so far are dollies, pans, and zooms; there seems to be no way yet to create video with a more dynamic/shaky camera. Seems like something that could be achieved with a LoRA?
submitted by /u/CeFurkan
Hey everyone, I've been working on a Gradio-based frontend for ComfyUI that focuses on consistent character generation. It's not revolutionary by any means, but it's been an interesting experience for me. It's built around ComfyScript, in a limbo between pure Python and the ComfyUI API format, which means that while the workflow you get is fully usable in ComfyUI, it is very messy. The application includes a number of features built around that goal.
I initially built it to help generate datasets for custom characters. While this can be achieved by prompting, models usually have an inherent bias. For example, it's difficult to produce dark-skinned people with red hair, or to get a specific facial structure or skin color in combination with a specific ethnicity. This was a way to solve that issue by iteratively inpainting different parts to get a unique character, roughly as sketched below. So far it's worked pretty well for me, so I thought I'd showcase my work. It's very opinionated and built around the way I work, but that doesn't mean it has to stay that way. If anyone has suggestions or ideas for features, please let me know, either here or by opening an issue or pull request. Here's an Imgur album of some images; most are from the repository, but there are two additional examples: https://imgur.com/a/NZU8LEP
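To illustrate the iterative-inpainting idea (this is a generic diffusers sketch, not the author's ComfyScript code; masks and prompts are hypothetical):

```python
# Repaint one region at a time to sculpt a character the base model is
# biased against producing in a single prompt.
import torch
from diffusers import AutoPipelineForInpainting
from PIL import Image

pipe = AutoPipelineForInpainting.from_pretrained(
    "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("base_character.png").convert("RGB")
# Each step pairs a region mask with the trait to force into that region
steps = [
    ("hair_mask.png", "long red hair"),
    ("face_mask.png", "dark skin, freckles, sharp jawline"),
]
for mask_path, prompt in steps:
    mask = Image.open(mask_path).convert("L")
    image = pipe(prompt=prompt, image=image, mask_image=mask,
                 strength=0.9).images[0]

image.save("unique_character.png")
```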
Hello guys. I currently have a 3090 with 24 GB VRAM + 32 GB RAM. Since DDR4 memory has hit the end of its production cycle, I need to make a decision now. I work mainly with Flux, WAN, and VACE. Could expanding my RAM to 64 GB make any difference in generation time, or do I simply not need more than 32 GB with 24 GB of VRAM? Thanks for your input in advance.
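One reason system RAM matters with these models: when a checkpoint doesn't fit in 24 GB of VRAM, runners typically offload idle components (text encoder, VAE, transformer) to system RAM and swap them onto the GPU as needed, so RAM becomes the staging area. A minimal diffusers example of that pattern (the model ID is just an example; WAN/VACE runners use the same idea):

```python
# CPU offload keeps idle submodules in system RAM, so total RAM, not just
# VRAM, bounds which models run without hitting disk swap.
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
)
# Moves each component to the GPU only while it is actually running;
# everything else waits in system RAM.
pipe.enable_model_cpu_offload()

image = pipe("a lighthouse at dusk", num_inference_steps=28).images[0]
image.save("out.png")
```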
I know ads are bad, but this is almost a public service to the community
I will help make that simple workflow to satisfy your needs; I will cut that fat.
I am open to all kinds of things
How much?
It is illegal to buy crypto with the money in my bank account in my country
So, I will take anything that you can offer. I may do it for free if you ask nicely
Payment method?
PayPal
Let me help you
DM me now before it's too late
Using VENV: C:\SD.next\sdnext\venv
22:03:13-972163 INFO Starting SD.Next
22:03:13-986475 INFO Logger: file="C:\SD.next\sdnext\sdnext.log" level=INFO host="LAPTOP-T2GEUGHV" size=127006
mode=append
22:03:13-988474 INFO Python: version=3.10.6 platform=Windows bin="C:\SD.next\sdnext\venv\Scripts\python.exe"
venv="C:\SD.next\sdnext\venv"
22:03:14-195598 INFO Version: app=sd.next updated=2025-07-06 hash=d5d857aa branch=master
url=https://github.com/vladmandic/sdnext/tree/master ui=main
22:03:14-685663 INFO Version: app=sd.next latest=2025-07-06T00:17:54Z hash=d5d857aa branch=master
22:03:14-696808 INFO Platform: arch=AMD64 cpu=AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD system=Windows
release=Windows-10-10.0.26100-SP0 python=3.10.6 locale=('English_Malaysia', '1252')
docker=False
22:03:14-700326 INFO Args: []
22:03:14-710840 INFO ROCm: AMD toolkit detected
22:03:14-747216 WARNING ROCm: no agent was found
22:03:14-747216 INFO ROCm: version=6.2
22:03:14-749813 WARNING Failed to load ZLUDA: Could not find module
'C:\SD.next\ZLUDA-nightly-windows-rocm6-amd64\nvcuda.dll\nvcuda.dll' (or one of its
dependencies). Try using the full path with constructor syntax.
22:03:14-750823 INFO Using CPU-only torch
22:03:14-751857 INFO ROCm: HSA_OVERRIDE_GFX_VERSION auto config skipped: device=None version=None
22:03:14-840100 WARNING Modified files: ['webui.bat']
22:03:14-916709 INFO Install: verifying requirements
22:03:14-975612 INFO Extensions: disabled=[]
22:03:14-976628 INFO Extensions: path="extensions-builtin" enabled=['Lora', 'sd-extension-chainner',
'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui',
'stable-diffusion-webui-rembg']
22:03:14-982038 INFO Extensions: path="extensions" enabled=[]
22:03:14-983043 INFO Startup: quick launch
22:03:14-985188 INFO Extensions: disabled=[]
22:03:14-986191 INFO Extensions: path="extensions-builtin" enabled=['Lora', 'sd-extension-chainner',
'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui',
'stable-diffusion-webui-rembg']
22:03:14-990187 INFO Extensions: path="extensions" enabled=[]
22:03:14-995283 INFO Installer time: total=1.78 latest=0.70 base=0.28 version=0.20 git=0.17 files=0.09
requirements=0.08 log=0.08 installed=0.08 torch=0.05
22:03:14-997330 INFO Command line args: [] args=[]
22:03:22-627821 INFO Torch: torch==2.7.1+cpu torchvision==0.22.1+cpu
22:03:22-629821 INFO Packages: diffusers==0.35.0.dev0 transformers==4.53.0 accelerate==1.8.1 gradio==3.43.2
pydantic==1.10.21
22:03:23-331756 INFO Engine: backend=Backend.DIFFUSERS compute=cpu device=cpu attention="Scaled-Dot-Product"
mode=no_grad
22:03:23-336881 INFO Torch parameters: backend=cpu device=cpu config=Auto dtype=torch.float32 context=no_grad
nohalf=False nohalfvae=False upcast=False deterministic=False tunable=[False, False] fp16=fail
bf16=fail optimization="Scaled-Dot-Product"
22:03:23-338880 INFO Device:
22:03:23-609726 INFO Available VAEs: path="models\VAE" items=0
22:03:23-611726 INFO Available UNets: path="models\UNET" items=0
22:03:23-612730 INFO Available TEs: path="models\Text-encoder" items=0
22:03:23-615391 INFO Available Models: safetensors="models\Stable-diffusion":2 diffusers="models\Diffusers":0
items=2 time=0.00
22:03:23-626224 INFO Available LoRAs: path="models\Lora" items=0 folders=2 time=0.00
22:03:23-645701 INFO Available Styles: path="models\styles" items=288 time=0.02
22:03:23-726925 INFO Available Detailer: path="models\yolo" items=10 downloaded=0
22:03:23-728936 INFO Load extensions
22:03:24-730797 INFO Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py' Using
sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
22:03:24-750484 INFO Available Upscalers: items=72 downloaded=0 user=0 time=0.01 types=['None', 'Resize', 'Latent',
'AsymmetricVAE', 'DCC', 'VIPS', 'ChaiNNer', 'AuraSR', 'ESRGAN', 'RealESRGAN', 'SCUNet',
'Diffusion', 'SwinIR']
22:03:24-757459 INFO UI locale: name="Auto"
22:03:24-758749 INFO UI theme: type=Standard name="black-teal" available=13
22:03:26-918871 INFO Extension list is empty: refresh required
22:03:28-309571 INFO Local URL: http://127.0.0.1:7860/
22:03:28-530142 INFO [AgentScheduler] Task queue is empty
22:03:28-531141 INFO [AgentScheduler] Registering APIs
22:03:29-018353 INFO Selecting first available checkpoint
22:03:29-020355 INFO Startup time: total=18.19 torch=7.49 launch=1.60 ui-extensions=1.59 installer=1.39 libraries=1.12 gradio=1.02 extensions=1.01
app-started=0.58 ui-networks=0.32 ui-control=0.31 ui-txt2img=0.30 ui-video=0.27 ui-img2img=0.18 transformers=0.15 ui-defaults=0.13
ui-models=0.13 api=0.12 diffusers=0.11 detailer=0.08 onnx=0.05
22:05:29-028702 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=126 elapsed=120.01 eta=None progress=0
22:07:29-875010 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=247 elapsed=240.86 eta=None progress=0
22:09:30-741802 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=368 elapsed=361.73 eta=None progress=0
22:11:31-620733 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=489 elapsed=482.6 eta=None progress=0
22:13:32-612584 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=610 elapsed=603.6 eta=None progress=0
22:15:32-639752 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=730 elapsed=723.62 eta=None progress=0
22:17:33-539797 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=850 elapsed=844.52 eta=None progress=0
22:19:34-533158 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=971 elapsed=965.52 eta=None progress=0
22:21:35-519983 TRACE Server: alive=True requests=1 memory=0.64/15.34 status='idle' task='' timestamp=None current='' id='d518b2af6076494' job=0 jobs=0
total=1 step=0 steps=0 queued=0 uptime=1092 elapsed=1086.5 eta=None progress=0
What am I missing here?
Hello! I'm trying to train my own LoRA on my computer. The training completes without errors, but I can't trigger the LoRA. I don't think the problem is in the dataset: a LoRA created with the same dataset on civitai works fine. Please tell me what I'm doing wrong.
Link to lora: https://drive.google.com/file/d/11detwAzPjsaoVkbifEGuwZXXmEhSAdHY/view?usp=drive_link
Link to kohya_ss configuration: https://drive.google.com/file/d/1wIOKVzooiEXrwnrF5d4U5nlW5XGcJx4r/view?usp=drive_link
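One quick way to rule out the inference setup is to load the trained file directly in diffusers and compare fixed-seed outputs with and without it. This is a generic sanity-check sketch; the base model and trigger word below are placeholders, so swap in whatever you trained against:

```python
# Compare base vs. LoRA output on the same seed; identical images suggest
# the LoRA keys aren't being applied (e.g., base-model mismatch).
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "mytrigger, portrait photo"  # replace with your trigger word

base = pipe(prompt, generator=torch.Generator("cuda").manual_seed(42)).images[0]

pipe.load_lora_weights("my_lora.safetensors")  # the kohya_ss output file
with_lora = pipe(prompt, generator=torch.Generator("cuda").manual_seed(42)).images[0]

base.save("base.png")
with_lora.save("with_lora.png")
```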
LoRA-GGPO (Gradient-Guided Perturbation Optimization) is a novel method that leverages gradient and weight norms to generate targeted perturbations. By optimizing the sharpness of the loss landscape, LoRA-GGPO guides the model toward flatter minima, mitigating the double descent problem and improving generalization.
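A hedged sketch of the core idea (this is my reading of the description, not the paper's exact algorithm): a SAM-style two-step update where the perturbation points along the gradient and its magnitude is scaled by the gradient and weight norms, so the second gradient is taken at a nearby "sharper" point.

```python
# Perturb-then-update step on the LoRA parameters only; rho controls the
# perturbation radius. Assumes all lora_params are registered in opt.
import torch

def ggpo_style_step(lora_params, loss_fn, opt, rho=0.05):
    loss_fn().backward()  # gradient at the current point
    with torch.no_grad():
        grad_norm = torch.norm(torch.stack(
            [p.grad.norm() for p in lora_params if p.grad is not None]))
        eps = []
        for p in lora_params:
            if p.grad is None:
                eps.append(None)
                continue
            # Gradient direction, scaled by the weight norm so larger
            # weights receive proportionally larger perturbations
            e = rho * p.norm() * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    opt.zero_grad()
    loss_fn().backward()  # gradient at the perturbed (sharper) point
    with torch.no_grad():  # undo the perturbation before stepping
        for p, e in zip(lora_params, eps):
            if e is not None:
                p.sub_(e)
    opt.step()
    opt.zero_grad()
```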