Today — 4 April 2025 — StableDiffusion

Long, consistent AI anime is almost here. Wan 2.1 with LoRA. Generated in 720p on a 4090

4 April 2025 at 07:53

I was testing Wan and made a short anime scene with consistent characters. I used img2video, continuing from the last frame, to create long videos. I managed to make clips of up to 30 seconds this way.

Some time ago I made an anime with Hunyuan t2v, and quality-wise I find it better than Wan (Wan has more morphing and artifacts), but Hunyuan t2v is obviously worse in terms of control and complex interactions between characters. Some footage I took from this old video (during the future flashes), but the rest is all Wan 2.1 I2V with a trained LoRA. I took the same character from the Hunyuan anime opening and used it with Wan. Editing was done in Premiere Pro, and the audio is also AI-generated: I used https://www.openai.fm/ for the ORACLE voice and local-llasa-tts for the man and woman characters.

PS: Note that 95% of the audio is AI-generated, but some phrases from the male character are not. I got bored with the project and realized I would either show it like this or not show it at all. The music is from Suno, but the sound effects are not AI!

All my friends say it looks just like real anime and that they would never guess it is AI. And it does look pretty close.

submitted by /u/protector111
[link] [comments]

IGORR - ADHD: An AI-generated music video.

4 April 2025 at 08:21

Igorrr's music video for "ADHD" by @meat-dept

From Meat-Dept: After "Very Noise", we explored the possibilities of AI for this new Igorrr music video, "ADHD". We embraced almost all existing tools, both proprietary and open source, diverting and mixing them with our 3D tools. This video is a symbolic journey into an experimental therapy for treating a patient with ADHD, brimming with nods to "Very Noise".

We know the use of AI in art can be controversial right now. Together with Meat Dept we actually started the clip in 3D, as we did for "Very Noise", but at some point we were laughing so hard trying to do creepy things with AI that the clip ended up as a mix of both technologies. The music, however, is 100% homemade.

From Gautier: Kind of an autobiographical piece of music. Starting from one point and moving to another, with no clear link except the person itself. From simple thoughts, symbolized here as simple dots of sound in the silence, to a complex pathological chaos that somehow still stands. It gets worse and worse until the final giant let-go.

submitted by /u/Netsuko
[link] [comments]

Another example of Hunyuan text2vid followed by Wan 2.1 img2vid to achieve better animation quality.

4 April 2025 at 13:53

I saw the post from u/protector111 earlier, and wanted to show an example I achieved a little while back with a very similar workflow.

I also started out with animation LoRAs in Hunyuan for the initial frames. It involved a complicated mix of four LoRAs (I am not sure if it was even needed): three animation LoRAs of increasing dataset size but progressively less overtraining. The smaller-dataset Hunyuan LoRAs allowed for more stability in the result, because in Hunyuan you have to prompt close to a LoRA's original concepts to get stable output. I also included my older Boreal-HL LoRA, as it gives a lot more world understanding in the frames and makes them far more interesting in terms of detail. (You can probably use any Hunyuan multi-LoRA ComfyUI workflow for this.)
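For anyone who wants to script a similar multi-LoRA stack outside ComfyUI, here is a minimal diffusers sketch. The model repo, LoRA file names, and weights are placeholder assumptions, not the exact setup from this post:

import torch
from diffusers import HunyuanVideoPipeline

# Community Diffusers port of HunyuanVideo (repo name is an assumption).
pipe = HunyuanVideoPipeline.from_pretrained(
    "hunyuanvideo-community/HunyuanVideo", torch_dtype=torch.bfloat16
).to("cuda")

# Register each LoRA under its own adapter name.
pipe.load_lora_weights("anime_small.safetensors", adapter_name="anime_small")
pipe.load_lora_weights("anime_large.safetensors", adapter_name="anime_large")
pipe.load_lora_weights("boreal_hl.safetensors", adapter_name="boreal_hl")

# Blend the adapters; the smallest-dataset LoRA gets the highest weight,
# mirroring the stability trick described above.
pipe.set_adapters(["anime_small", "anime_large", "boreal_hl"], [0.9, 0.5, 0.6])

frames = pipe(
    prompt="anime style, two characters talking in a ruined city",
    num_frames=61, height=480, width=848,
).frames[0]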

I then placed the frames into what was probably initially a standard Wan 2.1 image2video workflow. Wan's base model actually produces some of the best animation motion out of the box of nearly every video model I have seen. I had to run all the Wan generation on Fal initially due to the time constraints of the competition I was doing this for. Fal ended up changing the underlying endpoint at some point, and I had to switch to Replicate (it is nearly impossible to get any response from Fal in their support channel about why these things happen). I did not use any additional LoRAs for Wan, though it will likely perform better with a proper motion one; when I have some time I may try to train one myself. A few shots of sliding motion I ended up having to run through Luma Ray, as for some reason it performed better there.
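The original was run through Fal/Replicate endpoints, so purely as a hedged illustration of the Hunyuan-frame-to-Wan handoff, a local diffusers equivalent might look like this (model repo, file names, and settings are assumptions):

import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import load_image, export_to_video

pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", torch_dtype=torch.bfloat16
).to("cuda")

# Feed a frame produced by the Hunyuan step as the conditioning image.
image = load_image("hunyuan_keyframe.png")
frames = pipe(
    image=image,
    prompt="anime scene, smooth character motion",
    num_frames=81,
).frames[0]
export_to_video(frames, "wan_clip.mp4", fps=16)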

At this point, though, it might be easier to use Gen-4's new i2v for better motion, unless you need to stick to open-source models.

I actually manually applied the traditional Gaussian-blur overlay technique for the hazy underlighting on a lot of these clips that did not have it initially. One drawback is that this lighting style can destroy a video at low bitrates.
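For anyone curious, that blur-overlay technique is roughly the following: blur a copy of the frame and composite it back with a "screen" blend so highlights bloom into a soft haze. A Pillow sketch with arbitrary radius and opacity, not the exact editing-suite settings used here:

from PIL import Image, ImageChops, ImageFilter

frame = Image.open("frame.png").convert("RGB")
glow = frame.filter(ImageFilter.GaussianBlur(radius=12))
hazy = ImageChops.screen(frame, glow)        # lifts highlights into a soft bloom
hazy = Image.blend(frame, hazy, alpha=0.6)   # dial the effect back
hazy.save("frame_hazy.png")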

By the way, the Japanese in that video likely sounds terrible, and there is some broken editing, especially around a quarter of the way into the video. I ran out of time to fix these issues due to the deadline of the competition this video was originally submitted for.

submitted by /u/KudzuEye
[link] [comments]

First post here! I mixed several LoRAs to get this style — would love to merge them into one

4 April 2025 at 03:24

Hi everyone! This is my first post here, so I hope I’m doing things right.

I’m not sure if it's okay to combine so many LoRAs, but I kept tweaking things little by little until I got a style I really liked. I don’t know how to create LoRAs myself, but I’d love to merge all the ones I used into a single one.

If anyone could point me in the right direction or help me out, that would be amazing!

Thanks in advance 😊

Workflow:

{Prompt}<lora:TQ_Iridescent_Fantasy_Creations:0.8> <lora:MJ52:0.5> <lora:xl_more_art-full_v1:1> <lora:114558v4df2fsdf5:1> <lora:illustrious_very_aesthetic_v1:0.5> <lora:XXX477:0.2> <lora:sowasowart_style:0.3> <lora:illustrious_flat_color_v2:0.6> <lora:haiz_ai_illu:0.7> <lora:checkpoint-e18_s306:0.75>

Steps: 45, CFG scale: 4, Sampler: Euler a, Seed: 4971662040, RNG: CPU, Size: 720x1280, Model: waiNSFWIllustrious_v110, Version: f2.0.1v1.10.1-previous-659-gc055f2d4, Model hash: c364bbdae9, Hires steps: 20, Hires upscale: 1.5, Schedule type: Normal, Hires Module 1: Use same choices, Hires upscaler: R-ESRGAN 4x+ Anime6B, Skip Early CFG: 0.15, Hires CFG Scale: 3, Denoising strength: 0.35

CivitAI: espadaz Creator Profile | Civitai
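Since the goal is to bake all of these into a single file: one workable approach is a rank-concatenation merge, where each LoRA's low-rank factors are stacked so the merged file reproduces the weighted sum of their effects. Below is a minimal Python sketch assuming kohya-style .safetensors LoRAs that all target the same base model; it ignores per-LoRA alpha keys (which a real tool such as sd-scripts' svd_merge_lora.py handles properly), and the merged rank grows with every LoRA added. File names and weights are placeholders:

import torch
from safetensors.torch import load_file, save_file

# (path, merge weight) — substitute your own files and the weights
# from the prompt above.
loras = [
    ("TQ_Iridescent_Fantasy_Creations.safetensors", 0.8),
    ("MJ52.safetensors", 0.5),
    ("illustrious_flat_color_v2.safetensors", 0.6),
]

downs, ups = {}, {}
for path, weight in loras:
    for key, tensor in load_file(path).items():
        if key.endswith("lora_down.weight"):
            downs.setdefault(key, []).append(tensor)
        elif key.endswith("lora_up.weight"):
            # Fold the merge weight into the up matrix only, so each
            # up @ down product is scaled exactly once.
            ups.setdefault(key, []).append(weight * tensor)

merged = {}
for key, parts in downs.items():
    merged[key] = torch.cat(parts, dim=0)  # stack along the rank dimension
for key, parts in ups.items():
    merged[key] = torch.cat(parts, dim=1)  # stack along the rank dimension

save_file(merged, "merged_style.safetensors")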

submitted by /u/Ztox_
[link] [comments]

How-to guide: 8x RTX 4090 server for local inference

4 April 2025 at 02:33

Marco Mascorro built a pretty cool 8x RTX 4090 server for local inference and wrote a detailed how-to guide on what parts he used and how to put everything together. Posting here as well, as I think this may be interesting to anyone who wants to build a local rig for very fast image generation with open models.

Full guide is here: https://a16z.com/building-an-efficient-gpu-server-with-nvidia-geforce-rtx-4090s-5090s/

Happy to hear feedback or answer any questions in this thread.

PS: In case anyone is confused, the photos show parts for two 8xGPU servers.

submitted by /u/appenz
[link] [comments]

Materia Soup (made with Illustrious / ComfyUI / Inkscape)

4 April 2025 at 14:10

Workflow is just a regular KSampler / FaceDetailer in ComfyUI with a lot of wheel spinning and tweaking tags.

I wanted to make something using the two and a half years I've spent learning this stuff, but I had no idea how stupid/perfect it would turn out.

Full res here: https://imgur.com/a/Fxdp03u
Speech bubble maker: https://bubble-yofardev.web.app/
Model: https://civitai.com/models/941345/hoseki-lustrousmix-illustriousxl

submitted by /u/CrasHthe2nd
[link] [comments]

Bytedance Omnihuman is kinda crazy.

3 April 2025 at 21:13

Sent this "get well" message to my buddy. Made with the new "AI Avatar" mode in Bytedance's Dreamina, which uses OmniHuman under the hood. I used one of my old Flux images as a starting point.

Unsurprisingly it is heavily censored but still fun nonetheless.

submitted by /u/nootropicMan
[link] [comments]

ComfyUI native workflow | WAN 2.1 14B I2V, 720x720px, 65 frames, only 11 minutes gen time on an RTX 3070 with 8GB VRAM

4 April 2025 at 12:45

https://reddit.com/link/1jrazzi/video/y536tk3pctse1/player

Hello Everyone,

I created a workflow that allows you to generate 720x720px videos with 65 frames using the WAN 2.1 I2V 14B model in approximately 11 minutes, running on a system with 8GB of VRAM and 16GB of RAM.

Link to workflow: https://brewni.com/Genai/6QE994g2?tag=0
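The linked workflow is ComfyUI-native; for reference, the usual knobs that make a 14B model fit in roughly 8 GB elsewhere are CPU offload and tiled VAE decoding. A hedged diffusers equivalent, where the model repo and input file are assumptions rather than what this workflow uses:

import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import load_image, export_to_video

pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # keep only the active submodule on the GPU
pipe.vae.enable_tiling()         # decode the video in tiles to cap VRAM spikes

frames = pipe(
    image=load_image("start_frame.png"),  # hypothetical input image
    prompt="a short description of the motion",
    height=720, width=720, num_frames=65,
).frames[0]
export_to_video(frames, "output.mp4", fps=16)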

submitted by /u/Sticky_Ray
[link] [comments]

WAN2.1 is paying attention.

4 April 2025 at 13:37

I thought this was cool. Without prompting for it, WAN 2.1 mirrored her movements on the camera's view screen.
Using InstaSD's "WAN 2.1 I2V 720P – 54% Faster Video Generation with SageAttention + TeaCache" ComfyUI workflow.
https://civitai.com/articles/12250/wan-21-i2v-720p-54percent-faster-video-generation-with-sageattention-teacache
Prompt:
Realistic photo, editorial, beautiful Swedish model with ivory skin in voluminous down jacket made of pink and blue popcorn, photographers studio, opening her jacket

RunPod with an H100 = 5 min render.
1280x720, 30 steps, CFG 7.

submitted by /u/jefharris
[link] [comments]

Could Stable Diffusion Models Have a "Thinking Phase" Like Some Text Generation AIs?

3 April 2025 at 19:29

I’m still getting the hang of stable diffusion technology, but I’ve seen that some text generation AIs now have a "thinking phase"—a step where they process the prompt, plan out their response, and then generate the final text. It’s like they’re breaking down the task before answering.

This made me wonder: could stable diffusion models, which generate images from text prompts, ever do something similar? Imagine giving it a prompt, and instead of jumping straight to the image, the model "thinks" about how to best execute it—maybe planning the layout, colors, or key elements—before creating the final result.

Is there any research or technique out there that already does this? Or is this just not how image generation models work? I’d love to hear what you all think!

submitted by /u/TheArchivist314
[link] [comments]

Issues finding working AI image generation software for Windows with an AMD GPU

4 April 2025 at 13:04

Hi everyone,

as mentioned in the title, I have tried multiple programs for AI image generation. Most of them won't work, as they only support AMD on Linux, and I cannot manage to get ROCm working. The only one I managed to use, with limited results, is Stable Diffusion, but as soon as I try to increase some parameters for quality etc., I instantly get a VRAM error.
I know most of these programs are optimized for Nvidia cards, but I have a 6950 XT with 16GB of VRAM, yet I can only push parameters to about half of what a friend of mine uses with his RTX 2080. Even 1920x1080 generation gives me errors, and the results at anything lower are as awful as they are useless.

Do you know of something that will work on Windows? I really don't want to install Linux. Regarding that last point, would these programs work via WSL too, or does it have to be an actual Linux installation?

Thanks in advance for any suggestions

submitted by /u/Customer-Artistic
[link] [comments]

Hunyuan pixelated videos

4 April 2025 at 14:38

Two videos with the same settings and the same workflow. Why the quality difference/pixelation? I can send the workflow if Reddit strips the metadata from the videos.

https://reddit.com/link/1jrdgov/video/epbhs34kxtse1/player

https://reddit.com/link/1jrdgov/video/2rfvmmlkxtse1/player

submitted by /u/Far-Reflection-9816
[link] [comments]

I TRAIN FLUX CHARACTER LORA FOR FREE

3 April 2025 at 16:57

As the title says, I will train FLUX character LoRAs for free. You just have to send your dataset (images only) and I will train it for free. Here are two examples of LoRAs I trained myself. Contact me via X @ByJayAIGC or Discord: https://discord.gg/sRTNEUGj

submitted by /u/Recent-Percentage377
[link] [comments]