Yesterday — 14 April 2025 · StableDiffusion

MineWorld - A Real-time interactive and open-source world model on Minecraft

14 April 2025 at 05:50

Our model is trained solely in the Minecraft game domain. As a world model, it is given an initial image of the game scene, and the user selects an action from the action list. The model then generates the next scene, in which the selected action takes place.

Code and Model: https://github.com/microsoft/MineWorld
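The interaction loop described above (pick an action, the model predicts the next frame) has a simple shape. A minimal sketch of that loop, with a dummy function standing in for the model — the names and action list here are illustrative, not the actual MineWorld API:

```python
# Action-conditioned world-model loop: next_frame = model(frame, action).
# The real model predicts game frames; this dummy just chains labels.
ACTIONS = ["forward", "back", "left", "right", "jump", "attack"]

def step(frame, action):
    """Stand-in for the world model: maps (frame, action) -> next frame."""
    assert action in ACTIONS
    return f"{frame}->{action}"

frame = "scene_0"                      # the provided initial game scene
for action in ["forward", "forward", "jump"]:
    frame = step(frame, action)        # user picks, model generates

print(frame)  # scene_0->forward->forward->jump
```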

submitted by /u/Designer-Pair5773

Flux vs HiDream (Blind Test)

13 April 2025 at 19:22

Hello all, I threw together some "challenging" AI prompts to compare Flux and HiDream. Let me know which you like better: "LEFT or RIGHT". I used Flux FP8 (euler) vs HiDream NF4 (unipc), since both are quantized, reduced from the full FP16 models. I used the same prompt and seed to generate the images.

PS. I have a 2nd set coming later, just taking its time to render out :P

Prompts included. *Nothing cherry-picked. I'll confirm which side is which a bit later, although I suspect you'll all figure it out!

submitted by /u/puppyjsn

Better prompt adherence in HiDream by replacing the INT4 LLM with an INT8.

14 April 2025 at 01:36

I replaced the hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4 LLM with clowman/Llama-3.1-8B-Instruct-GPTQ-Int8 in lum3on's HiDream Comfy node. It seems to improve prompt adherence, though it does require more VRAM.

The image on the left is the original hugging-quants/Meta-Llama-3.1-8B-Instruct-GPTQ-INT4. On the right is clowman/Llama-3.1-8B-Instruct-GPTQ-Int8.

Prompt lifted from CivitAI: A hyper-detailed miniature diorama of a futuristic cyberpunk city built inside a broken light bulb. Neon-lit skyscrapers rise within the glass, with tiny flying cars zipping between buildings. The streets are bustling with miniature figures, glowing billboards, and tiny street vendors selling holographic goods. Electrical sparks flicker from the bulb's shattered edges, blending technology with an otherworldly vibe. Mist swirls around the base, giving a sense of depth and mystery. The background is dark, enhancing the neon reflections on the glass, creating a mesmerizing sci-fi atmosphere.
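The extra VRAM cost of the INT4-to-INT8 swap can be ballparked from the parameter count alone (weights only, ignoring activations and KV cache — a rough estimate, not measured numbers):

```python
# An 8B-parameter LLM: INT4 stores 0.5 bytes per weight, INT8 stores 1 byte.
params = 8_000_000_000
int4_gb = params * 0.5 / 1024**3   # 4-bit weights
int8_gb = params * 1.0 / 1024**3   # 8-bit weights
print(f"INT4 = {int4_gb:.1f} GiB, INT8 = {int8_gb:.1f} GiB")
# INT4 = 3.7 GiB, INT8 = 7.5 GiB
```

So the text encoder alone roughly doubles its footprint, which matters when the diffusion model is sharing the same card.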

submitted by /u/Enshitification

What is the best upscaling model currently available?

14 April 2025 at 06:30

I'm not quite sure about the distinctions between tile, tile controlnet, and upscaling models. It would be great if you could explain these to me.

Additionally, I'm looking for an upscaling model suitable for landscapes, interiors, and architecture, rather than anime or people. Do you have any recommendations for such models?

This is my example image.

https://preview.redd.it/f550z9wevque1.png?width=1280&format=png&auto=webp&s=c9a86504efbd86ec174f0dd27f42d46d65aad5b1

I would like the details to remain sharp while improving the image quality. With the upscale model I used previously, I didn't like how the details were lost, leaving the result slightly blurred. Below is the image I upscaled.

https://preview.redd.it/t9xiz85wvque1.jpg?width=2560&format=pjpg&auto=webp&s=a2960a8d792d2b360bc6cf0d2e6cb624ea50bcf8
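On the distinction the post asks about, in brief: a plain upscaling model (the ESRGAN family and similar) maps low-res pixels to high-res pixels directly, while a tile ControlNet guides a diffusion model to re-generate the image piece by piece, using each low-res tile as a hint, so it can invent new detail. The tiling itself is simple bookkeeping — a pure-Python sketch with an assumed 512 px tile size:

```python
# Enumerate tile origins covering an image, as a tiled upscale pass would.
def tiles(width, height, tile=512):
    """Yield (x, y) origins of tile-sized patches covering the image."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield (x, y)

grid = list(tiles(2560, 1440))
print(len(grid))  # 5 columns x 3 rows = 15 tiles
```

Each tile is then denoised with the low-res content as conditioning, which is why tile-based approaches can add detail that a direct pixel-mapping upscaler cannot.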

submitted by /u/Disastrous-Cash-8375

All generations after the first are extremely slow all of a sudden?

14 April 2025 at 12:00

I've been generating fine for the last couple of weeks in ComfyUI, and now all of a sudden every single workflow is absolutely plagued by this issue. It doesn't matter whether it's a generic Flux workflow or a complex Hunyuan one: they all generate fine (within a few minutes) the first time, and then basically brick my PC on the second run.

I feel like there may have been a Windows update recently? Could that have caused it? Maybe some automatic update? I haven't updated anything myself or fiddled with any settings.

submitted by /u/AnonymousTimewaster

Flux vs HiDream (Blind Test #2)

13 April 2025 at 20:49

Hello all, here is my second set. This competition will be much closer, I think! I threw together some "challenging" AI prompts to compare Flux and HiDream, testing what is possible today on 24 GB of VRAM. Let me know which you like better: "LEFT or RIGHT". I used Flux FP8 (euler) vs HiDream FULL-NF4 (unipc), since both are quantized, reduced from the full FP16 models. I used the same prompt and seed to generate the images. (Apologies in advance for not equalizing the sampler, I just went with defaults, and for the text size; I'll share all the prompts in the thread.)

Prompts included. *Nothing cherry-picked. I'll confirm which side is which a bit later. Thanks for playing, hope you have fun.

submitted by /u/puppyjsn

Looking for Updated Tutorials on Training Realistic Face LoRAs for SDXL (Using Kohya or Other Methods)

14 April 2025 at 12:51

It’s been a while since I last worked with SDXL, and back then, most people were using Kohya to train LoRAs. I’m now planning to get back into it and want to focus on creating realistic LoRAs—mainly faces and clothing.

I’ve been searching for tutorials on YouTube, but most of the videos I’ve come across are over a year old. I’m wondering if there are any updated guides, videos, or blog posts that reflect the current best practices for LoRA training on SDXL. I'm planning to use Runpod to train so vram isn't a problem.

Any advice, resources, or links would be greatly appreciated. Thanks in advance for the help!

submitted by /u/Daszio

Tested HiDream NF4... completely overhyped?

13 April 2025 at 20:53

I just spent two hours testing HiDream locally, running the NF4 version, and it's a massive disappointment:

  • Prompt adherence is good, but it doesn't beat dedistilled Flux with high CFG. It's nowhere near ChatGPT-4o.

  • Characters look like somewhat enhanced Flux; in fact, I sometimes got the Flux chin cleft. I'm leaning towards the "it was trained using Flux weights" theory.

  • Uncensored, my ass: it's very difficult to get boobs using the uncensored Llama 3 LLM, and despite trying tricks I could never get a full nude, whether realistic or anime. For me it's more censored than Flux was.

Have I been doing something wrong? Is it because I tried the NF4 version?

If this model proves to be fully finetunable, unlike Flux, I think it has great potential.

I'm also aware that we're just a few days past the release, so the Comfy nodes are still experimental; most probably we're not yet tapping the full potential of the model.

submitted by /u/Tablaski

Is it possible to use generative models to upscale videos?

14 April 2025 at 08:04

I haven't been using any models for about two years, so my knowledge is very outdated. Can I feed a video into a model and get it to upscale from 240p to 4K? Topaz Video AI does a terrible job in such cases, which is why I'm asking.
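Worth noting about the 240p-to-4K ask: generative video upscalers generally process frame by frame (plus temporal-consistency tricks), and that jump is a 9x factor per axis, i.e. roughly 81x more pixels to invent — part of why tools struggle with it. The scale math, as a sketch with dummy sizes and no actual model:

```python
# Per-frame scale factor needed to take a 240p frame to 4K (2160p) height.
def upscaled(size, scale):
    w, h = size
    return (w * scale, h * scale)

src = (426, 240)            # a 240p frame
scale = 2160 // src[1]      # factor to reach 4K height
print(scale, upscaled(src, scale))
```

Any model asked to do this in one shot is hallucinating almost all of the output detail, so multi-stage approaches (e.g. 240p to ~1080p with a model, then a second pass) are often more realistic.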

submitted by /u/DeviantPlayeer

Where are the HiDream Models saved?

14 April 2025 at 06:47

Hi, I'm about to run some tests with HiDream, and as the node is quite a black box, it seems I have downloaded all the options. Since I will only be able to use the quantized versions, I'm trying to find where the models are stored so I can delete the rest.
It would be nice to get better insight into what that node is doing behind the scenes.
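One likely place to look: Comfy nodes that fetch weights through huggingface_hub usually store them in the Hugging Face hub cache rather than under ComfyUI/models. Listing that cache shows what was downloaded (this is the library's default path; the HF_HOME environment variable can relocate it):

```python
# List the Hugging Face hub cache, where hub-downloading nodes keep models.
import os

cache = os.path.expanduser("~/.cache/huggingface/hub")
if os.path.isdir(cache):
    for entry in sorted(os.listdir(cache)):
        print(entry)            # one "models--org--name" folder per repo
else:
    print("no hub cache found at", cache)
```

Deleting a `models--org--name` folder there removes that download; ComfyUI's own `models/` subfolders are the other place worth checking.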

submitted by /u/MakeParadiso

LoRA training help needed: tags vs. captions.

14 April 2025 at 12:11

I asked GPT, and it said it depends on the text encoder whether tags or captions work best. I'm trying to train very abstract features on people. It worked perfectly with captions on the first try with Flux, and now I'm slowly learning SDXL, with ranks, learning rates and whatnot, to achieve the same results there as well.

Supposedly Pony/SDXL base/Illustrious and so on train better on tags than captions, and vice versa for other models.

So, without a hallucinating, dumb bot: how does one properly train an SDXL/SD 1.5 LoRA?
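For concreteness, here are the two caption styles side by side, in the form of the per-image .txt caption files a trainer like kohya_ss reads. The rule of thumb (a community heuristic, not a guarantee): booru-tag style tends to suit Pony/Illustrious-type checkpoints trained on tagged data, while natural-language captions tend to suit models with stronger text encoders like Flux. The example lines are illustrative, not from any dataset:

```python
# Same hypothetical image, two encodings of the same facts.
tag_style = "1girl, red dress, standing, outdoors, sunset, from side"
caption_style = "A woman in a red dress stands outdoors at sunset, seen from the side."

for style in (tag_style, caption_style):
    print(style)
```

For SD 1.5/SDXL base, many trainers mix the two: a trigger word plus tags, with tag shuffling enabled so no single tag absorbs the concept.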

submitted by /u/Duckers_McQuack

Same element, different ambience

14 April 2025 at 12:01

Hello! I need to find a way to take a still image (of a house, for example) and make changes to it: day, night, snowing... I've tried ControlNet, img2img, inpainting... and combinations of all of them... but I can't do it.

Can you think of a way to do it? I always end up changing the texture of the house's walls, or other key elements that shouldn't change.

Thank you!

submitted by /u/ricardonotion