Normal View

Yesterday — 9 May 2025 · StableDiffusion

ICEdit, I think it is more consistent than GPT-4o.

9 May 2025 at 12:54

In-Context Edit is a novel approach that achieves state-of-the-art instruction-based editing using just 0.5% of the training data and 1% of the parameters required by prior SOTA methods.
https://river-zhang.github.io/ICEdit-gh-pages/

I tested the three functions of image deletion, addition, and attribute modification, and the results were all good.

submitted by /u/Some_Smile5927
[link] [comments]

I give up

9 May 2025 at 12:07

When I bought the RX 7900 XTX, I didn't think it would be such a disaster. Stable Diffusion and FramePack in their entirety (by which I mean all versions, from the normal ones to the AMD forks): I sat there for hours trying. Nothing works... endless error messages. When I finally saw a glimmer of hope that it was working, it was nipped in the bud. Driver crash.

I don't just want the RX 7900 XTX for gaming; I also like to generate images. I wish I'd stuck with RTX.

This is frustration speaking after hours of trying and tinkering.

Have you had a similar experience?

submitted by /u/Skara109
[link] [comments]

[Industry Case Study & Open Source] Real-World ComfyUI Workflow for Garment Transfer—Breakthroughs in Detail Restoration

9 May 2025 at 08:58

When we applied ComfyUI for garment transfer at a clothing company, we ran into challenges with details such as fabric texture, wrinkles, and lighting restoration. After multiple rounds of optimization, we developed a workflow focused on enhancing details, which has now been open-sourced. This workflow performs better at reproducing complex patterns and special materials, and it is easy to get started with. We welcome everyone to download and try it, provide suggestions, or share ideas for improvement. We hope this experience brings practical help to peers, and we look forward to working with you to advance the industry.
Thank you all for following my account; I will keep updating.
Workflow address: https://openart.ai/workflows/flowspark/fluxfillreduxacemigration-of-all-things/UisplI4SdESvDHNgWnDf

submitted by /u/Lazy_Lime419
[link] [comments]

Civitai is taken over by Openai generations and I hate it

8 May 2025 at 16:15

Nothing wrong with OpenAI, its image generations are top-notch and beautiful, but I feel like AI sites are diluting the efforts of those who want AI to be free and independent from censorship... and including the OpenAI API is like inviting a lion to eat with the kittens.

Fortunately, Illustrious (the majority of the best images on the site) and Pony are still pretty unique in their niches... but for how long?

submitted by /u/Dear-Spend-2865
[link] [comments]

I made an app to catalogue safetensor files

9 May 2025 at 03:00

So since I just found out what LoRAs are, I have been downloading them like a madman. However, this makes it incredibly difficult to know which LoRA does what when you look at a directory with around 500 safetensor files in it. So I made this application that scans your safetensor folder and creates an HTML page in it. When you open it, it shows all the safetensor thumbnails with the names of the files, and the thumbnails are clickable links that take you to their corresponding CivitAI page, if the files are found there. Otherwise there's no link and no thumbnail.

I don't know if there is already a STANDALONE app like this, but it seemed easier to make my own.
You can check it out here:
https://github.com/petermg/SafeTensorLibraryMaker
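
If you want to roll something similar yourself, here is a minimal sketch of the same idea (this is not the code from the repo above): hash each .safetensors file, query CivitAI's public by-hash lookup, and write a bare-bones HTML index. The folder path, the response fields used, and the HTML layout are assumptions for illustration only.

```python
# Minimal sketch: catalogue a folder of .safetensors files as an HTML page.
import hashlib
import html
import json
import urllib.request
from pathlib import Path

LORA_DIR = Path("./loras")  # assumption: folder containing your .safetensors files

def sha256_of(path: Path) -> str:
    """CivitAI indexes uploaded files by their SHA-256 hash."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def civitai_lookup(file_hash: str) -> dict | None:
    """Query CivitAI's by-hash endpoint; return None if the file isn't listed."""
    url = f"https://civitai.com/api/v1/model-versions/by-hash/{file_hash}"
    try:
        with urllib.request.urlopen(url, timeout=30) as resp:
            return json.load(resp)
    except Exception:
        return None

rows = []
for f in sorted(LORA_DIR.glob("*.safetensors")):
    info = civitai_lookup(sha256_of(f))
    if info:
        page = f"https://civitai.com/models/{info['modelId']}"
        thumb = (info.get("images") or [{}])[0].get("url", "")
        rows.append(f'<a href="{page}"><img src="{thumb}" height="200"><br>{html.escape(f.name)}</a>')
    else:
        rows.append(html.escape(f.name))  # not on CivitAI: name only, no thumbnail

(LORA_DIR / "index.html").write_text("<br>\n".join(rows), encoding="utf-8")
```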

submitted by /u/omni_shaNker
[link] [comments]

QLoRA training of HiDream (60GB -> 37GB)


https://preview.redd.it/cetdxep6hoze1.png?width=800&format=png&auto=webp&s=e58f780eb2a38bc9af56b54084c1628ed9fded0a

Fine-tuning HiDream with LoRA has been challenging because of the memory constraints! But it's not right to let that stand in the way of this MIT-licensed model's adaptation. So, we have shipped QLoRA support in our HiDream LoRA trainer 🔥

The purpose of this guide is to show how easy it is to apply QLoRA, thanks to the PEFT library, and how well it integrates with Diffusers. I am aware of other trainers that offer even lower memory usage, and this is not (by any means) a competitive appeal against them.

Check out the guide here: https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/README_hidream.md#using-quantization
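
For a feel of what the guide covers, here is a rough sketch of the QLoRA recipe (not the actual trainer script): quantize the frozen HiDream transformer to 4-bit NF4 with bitsandbytes, then attach small bf16 LoRA adapters via PEFT. The class name, repo id, LoRA rank, and target modules below are assumptions based on recent Diffusers releases and may differ from the script.

```python
# Rough QLoRA sketch: 4-bit frozen base + trainable LoRA adapters on top.
import torch
from diffusers import BitsAndBytesConfig, HiDreamImageTransformer2DModel
from peft import LoraConfig

# 4-bit NF4 quantization keeps the frozen base weights small (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

transformer = HiDreamImageTransformer2DModel.from_pretrained(
    "HiDream-ai/HiDream-I1-Full",   # assumption: Hub id of the MIT-licensed model
    subfolder="transformer",
    quantization_config=bnb_config,
    torch_dtype=torch.bfloat16,
)

# Freeze the quantized base; only the LoRA matrices will be trained in bf16.
transformer.requires_grad_(False)
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    init_lora_weights="gaussian",
    target_modules=["to_q", "to_k", "to_v", "to_out.0"],  # assumption: attention projections
)
transformer.add_adapter(lora_config)

trainable = sum(p.numel() for p in transformer.parameters() if p.requires_grad)
print(f"trainable LoRA parameters: {trainable:,}")
```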

submitted by /u/RepresentativeJob937
[link] [comments]

New option for Noob-AI v-pred! Lora to offset oversaturation

8 May 2025 at 17:05

https://civitai.com/models/1555532/noob-ai-xl-v-pred-stoopid-colorfix

Use it with negative weight to cancel oversaturation.

There are multiple ways to make this v-pred model work. I don't like going low on CFG with an ancestral sampler; I just don't like the result. I like it when there are plenty of details, so my go-to is CFG 5.5 with RescaleCFG 0.7 and DDPM SGM Uniform (plus a bunch of other stuff, but it doesn't really matter). Still, I've always used at least one style LoRA to offset the relatively odd skin color you get with some style tags, and I never really liked the backgrounds.

But sometimes it produced really broken images. This happens with a lot of artist tags and some prompts, and an excessive negative prompt can also lead to oversaturation. Sometimes it is cool, because the overabundance of a specific color can give you truly interesting results, like the true dark imagery SDXL always struggled with. But sometimes I don't want that black-on-black or whatever.

I also noticed that even when it doesn't completely fry the image, it still affects a lot of artist tags and destroys backgrounds. There are ways around it, but I've always added at least one style LoRA at low weight to get rid of those weird skin tones. Then I tried some VelvetS LoRAs and they gave me monstrosities or full-on furries for no apparent reason 🤣 It turns out the fried skin was being picked up as scales, fur, etc., and this model knows where to take that.

For over a month I was thinking in the back of my head: "Try this. Yes, it is stupid, but why not. Just do it."

And I tried. And it worked. I embraced oversaturation. I cranked it to the max across the whole basic color spectrum and made a LoRA that turns any image into a completely monochrome color sheet. Now you can use it with negative weight to offset this effect.

Tips on usage:

Oversaturation is not distributed equally across the model, so there is no single good weight. It is affected by color tags and even by the length of the negative prompt. -1 is generally safe across the board, but in some severe cases I had to go to -6. Check the comparisons to get the gist of it.

Simple prompts still tend to fall into the same pattern: blue theme > red theme > white theme, etc. Prompt more and add various colors to the prompt. This is an innate feature of the model; there is no need to battle it.

Add sketch, monochrome, and partially colored to the negative prompt.

Last but not least.

Due to the way it works, at a negative value this LoRA negates uniformly colored patches, effectively adding details: completely random details. Expect massive duplications, etc. To battle this, use my Detailer LoRA. It stabilizes details greatly and is fully v-pred. Or use some other stabilizer you like; I never tested them, since my Detailer does that anyway and does not alter style in the process.

This is just another option to have in your toolkit. You can go to a higher CFG and fix backgrounds and some artist tags with it. It does not replace the CFG-rebalancing nodes; those are still needed.
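
For anyone running this outside ComfyUI, here is a minimal Diffusers-style sketch of the idea (my own assumption, not the author's workflow): load the colorfix LoRA and apply it with a negative adapter weight. The checkpoint id and file name are placeholders.

```python
# Sketch: apply a LoRA with NEGATIVE weight to counteract oversaturation.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "Laxhar/noobai-XL-Vpred-1.0",        # assumption: a v-pred NoobAI-XL checkpoint
    torch_dtype=torch.float16,
).to("cuda")

pipe.load_lora_weights("./stoopid_colorfix.safetensors", adapter_name="colorfix")
# A negative adapter weight subtracts the LoRA's effect; start around -1.0
# and push toward -6.0 for severe cases, as described above.
pipe.set_adapters(["colorfix"], adapter_weights=[-1.0])

image = pipe(
    "1girl, night city, neon lights",
    negative_prompt="sketch, monochrome, partially colored",
    num_inference_steps=28,
    guidance_scale=5.5,       # CFG 5.5 as in the post
    guidance_rescale=0.7,     # rough analogue of RescaleCFG 0.7
).images[0]
image.save("out.png")
```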

If you check image 4, I am not even sure whether to call the initial result a bug or a feature. It is quite a specific 1girl 🤭

submitted by /u/shapic
[link] [comments]

Whispers from Depth

9 May 2025 at 08:59

This video was created entirely using generative AI tools. It takes the form of a trailer for an upcoming movie. Every frame and sound was made with the following:

ComfyUI with WAN 2.1 txt2vid and img2vid; the last frame was created using FLUX.dev. Audio was created using Suno v3.5. I tried ACE to go fully open source, but couldn't get anything useful.

Feedback is welcome; drop your thoughts or questions below. I can share prompts. The workflows are not mine, just standard stuff you can find on CivitAI.

submitted by /u/BiceBolje_
[link] [comments]

What's going on with PixArt

8 May 2025 at 21:20

A few weeks ago I found out about PixArt, downloaded the Sigma 2K model, and experimented a bit with it. I liked its results. Just today I found out that Sigma is a year-old model. I went to see what has been happening with PixArt since that model, and it seems their last commits are from around May 2024. I saw some Reddit posts from September with people saying there should be a new PixArt model that month, supposedly competitive with Flux. Well, it's May 2025 and nothing has been released as far as I know. Does anyone know what is happening with PixArt? Are they still working on their model, or are they out of the industry or something?

submitted by /u/Qbsoon110
[link] [comments]

Banana Overdrive

8 May 2025 at 21:42

This has been a wild ride since WAN 2.1 came out. I used mostly free and local tools, except for Photoshop (Krita would work too) and Suno. The process began with simple sketches to block out camera angles, then I used Gemini or ChatGPT to get rough visual ideas. From there, everything was edited locally using Photoshop and FLUX.

Video generation was done with WAN 2.1 and the Kijai wrapper on a 3090 GPU. While working on it, new things like TeaCache, CFG-Zero, FRESCA, or SLG kept popping up, so it's been a mix of learning and creating all the way.

Final edit was done in CapCut.

If you’ve got questions, feel free to ask. And remember, don’t take life too seriously... that’s the spirit behind this whole thing. Hope it brings you at least a smile.

submitted by /u/NebulaBetter
[link] [comments]

We created the first open source multiplayer world model with just $1.5K

8 May 2025 at 17:55

We've built a world model that allows two players to race each other on the same track.

The research and training cost was under $1.5K — made possible through focused engineering and innovation, not massive compute. You can even run it on a standard gaming PC!

We’re open-sourcing everything: the code, data, weights, architecture, and research.

Try it out: https://github.com/EnigmaLabsAI/multiverse/

Get the model and datasets: https://huggingface.co/Enigma-AI

And read about the technical details here: https://enigma-labs.io/

submitted by /u/EnigmaLabsAI
[link] [comments]