Today — 3 October 2024 — StableDiffusion

Has anyone successfully acquired a commercial license for FLUX.1-dev?

3 October 2024 at 10:47

I'm working on a commercial product that currently makes use of SD 1.5, along with capabilities made possible by its open-source nature and community ecosystem - e.g. ControlNets, finetunes, LoRAs, etc. SD 1.5 is no longer state of the art though, and we're due an upgrade.

FLUX seems to be the model family of choice in the community at the moment, and playing with [dev] for personal projects, it seems to be at the point where it can achieve all of the features we'd need, with better results than we can currently get from SD 1.5.

FLUX.1-dev seems to be the goldilocks model in this family - [schnell] is a bit more limited, and hasn't gained much traction in the community, as everyone is focusing their efforts on dev. [pro] is a bigger, more capable model that can produce better-quality images, but in many ways it's even more limited than [schnell], as it's only available through an API that offers very limited control over the generation.

It seems like [dev] would be the ideal model for our purposes, but obviously the non-commercial license keeps us from jumping right into using it. However, Black Forest Labs do say that you can get in touch to discuss commercial licensing. We haven't started the ball rolling on that yet - partially because we're still in the model evaluation phase, but also because in my experience, when the licensing / pricing model is "contact us", that usually means we'll be tying up some legal/financial people in a negotiation, as well as developer time in documenting requirements, volume, etc.

The terms state:

you must request a license from Company, which Company may grant to you in Company’s sole discretion and which additional use may be subject to a fee, royalty or other revenue share.

Which doesn't give much clarity on what a commercial agreement might look like, or how likely they are to issue one. I guess this probably varies case-by-case, so it makes sense for them to keep the terms vague to cover different types of agreements they might make under different scenarios. It seems plausible that we might get in touch, just to find out that they only intend to issue a handful of licenses and we don't qualify - or that the pricing model wouldn't make sense for us. For example, a revenue sharing model probably wouldn't be viable for us, as generative imagery is a relatively minor aspect of our platform, which is sold with enterprise pricing. A royalty model likely wouldn't work for us either, as the outputs are something our customers would be using commercially, and our existing contracts allow unrestricted use of outputs (and I don't see any new customers being willing to sign without that provision).

So, before I recommend kicking off a discovery process that drags a bunch of people into negotiations:

Has anyone here successfully obtained a commercial license for FLUX.1-dev? If so, what did that process look like, and how long did it take? Are there any broad indications of the pricing model you're able to share?

submitted by /u/dismantlemars

Is there a text-based guide to learning the absolute basics and fundamentals of SD?

3 October 2024 at 06:53

I want to learn how everything I use actually works, down to the nitty-gritty details. I'm interested in understanding the mechanics of an upscaler, how a sampler functions, what a VAE is, what masking entails, and what a detailer does. I want to grasp how these concepts work.
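As a concrete taste of one of those components, here's a minimal diffusers sketch of what the VAE does mechanically (the model ID and input file name are illustrative): it compresses pixels into a much smaller latent tensor, the sampler denoises in that latent space, and the VAE decodes the result back to pixels.

```python
import numpy as np
import torch
from diffusers import AutoencoderKL
from diffusers.utils import load_image

# Illustrative: the SD 1.5 VAE maps 512x512x3 pixels to a 64x64x4 latent.
vae = AutoencoderKL.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="vae"
)

img = load_image("input.png").resize((512, 512))
x = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0  # scale to [-1, 1]
x = x.permute(2, 0, 1).unsqueeze(0)                        # to NCHW layout

with torch.no_grad():
    latents = vae.encode(x).latent_dist.sample() * vae.config.scaling_factor
    recon = vae.decode(latents / vae.config.scaling_factor).sample

print(latents.shape)  # torch.Size([1, 4, 64, 64]) -- the space samplers work in
```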

I'm also curious about why Flux is a U-Net model while others like DreamMaker are checkpoints, even though they both generate images. I want to learn all of this, but I can't seem to find any comprehensive text guides. I strongly prefer reading to watching videos, and I only tolerate videos from experts or founders in the field, like Andrej Karpathy for LLMs.

Is there a similar authoritative figure for image generation? Alternatively, is there a well-written, end-to-end text guide that teaches image generation concepts thoroughly?

submitted by /u/CaptTechno

Powder - A text-to-image workflow for better skin in high Flux guidance images

2 October 2024 at 23:34

https://preview.redd.it/hblkrww7ffsd1.png?width=1024&format=png&auto=webp&s=b2381d638d7321df18fa423c3d0213b13391deef

Powder workflow for ComfyUI is available here

I found a way of using masked conditioning for part of the image inference to combine high and low Flux guidance into a single pass text-to-image workflow. I use this to remove the waxy look of skin textures for photorealistic portraits in Flux Dev, where the overall image needs to use high Flux guidance for good prompt adherence or character Lora likeness.

Please give Powder a try!

Instructions are in the workflow; I've copied them below:

Powder is a single-pass text-to-image workflow for Flux.1 [dev] based checkpoints. It is designed for photorealistic portraits that require high Flux guidance (3.5 or above) for the overall image. It aims to improve skin contrast and detail, avoiding the shiny, waxy, smoothed look.

High Flux guidance is required for good prompt adherence, image composition, colour saturation and close likeness with character Loras. Lower Flux guidance (1.7 to 2.2) improves skin contrast and detail but loses the mentioned benefits of high guidance for the overall image. Powder uses masked conditioning with varied Flux guidance according to a 3-phase schedule. It also uses masked noise injection to add skin blemishes. It can be run completely automatically, though there is a recommended optional step to manually edit the skin mask. Powder can be used with any Loras and controlnets that work with a standard KSampler, but it does not work with Flux.1 [schnell].

Powder uses an Ultralytics detector for skin image segments. Install the detector model using ComfyUI Manager > Model Manager and search for skin_yolov8n-seg_800.pt
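If you want to sanity-check the detector outside ComfyUI, the same model file can be run directly with the ultralytics package. A minimal sketch, assuming the .pt file has been downloaded locally (the file name is from the instructions above; the input image path is illustrative):

```python
from ultralytics import YOLO

model = YOLO("skin_yolov8n-seg_800.pt")   # the detector named above
results = model("portrait.png")           # illustrative input image

for r in results:
    if r.masks is not None:
        # One mask per detected skin segment; Powder builds its automatic
        # skin mask from segments like these.
        print(r.masks.data.shape)
```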

Image inference uses a KSampler as usual, but the scheduled steps are split into 3 phases (a code sketch of the control flow follows the list):

  • Phase 1: Each KSampler step uses a single (high) Flux guidance value for the whole image.
  • Phase 2: Latent noise is injected into the masked region. Then, inference proceeds like in Phase 1, except for a different (lower) Flux guidance value used for the masked region.
  • Phase 3: Similar to Phase 2, but using different settings for the injected noise and Flux guidance value applied to the masked region.
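To make the schedule concrete, here is a minimal PyTorch sketch of the control flow (not the actual Powder nodes: the denoiser callable and sampler math are stubbed, and all names and default values are illustrative). It shows the step split, the masked noise injection at the phase 2/3 boundaries, and the per-region guidance blend via the skin mask.

```python
import torch

def powder_phases_sketch(
    denoise,               # callable (latent, step, flux_guidance) -> latent (stub)
    latent: torch.Tensor,  # starting noise, e.g. [1, 16, 64, 64]
    mask: torch.Tensor,    # skin mask in latent space, 1 = masked region
    total_steps: int = 50,
    phase1_proportion: float = 0.24,
    guidance_high: float = 3.5,     # whole image in phase 1, unmasked region after
    guidance_low=(2.0, 1.8),        # masked region, phases 2 and 3 (illustrative)
    noise_strength=(0.3, 0.15),     # injected noise, phases 2 and 3 (illustrative)
):
    phase1_end = round(total_steps * phase1_proportion)       # 0.24 * 50 = 12
    phase2_end = phase1_end + (total_steps - phase1_end) // 2

    for step in range(total_steps):
        if step < phase1_end:
            # Phase 1: a single guidance value for the whole image.
            latent = denoise(latent, step, guidance_high)
            continue

        phase = 0 if step < phase2_end else 1
        if step in (phase1_end, phase2_end):
            # Entering phase 2 or 3: inject latent noise into the masked region only.
            latent = latent + noise_strength[phase] * torch.randn_like(latent) * mask

        # Phases 2/3: lower guidance inside the mask, high guidance elsewhere.
        lo = denoise(latent, step, guidance_low[phase])
        hi = denoise(latent, step, guidance_high)
        latent = lo * mask + hi * (1 - mask)

    return latent

# Toy run with a stub denoiser, just to exercise the control flow:
stub = lambda z, step, g: 0.98 * z
out = powder_phases_sketch(stub, torch.randn(1, 16, 64, 64),
                           (torch.rand(1, 1, 64, 64) > 0.8).float())
```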

At the end of Phase 1, the workflow pauses. Right-click on the image in "Edit skin mask" and select "Open in MaskEditor". The image will be fuzzy because it is not fully resolved, but its composition should be apparent. A rough mask will have been automatically generated. The mask should cover skin only; ensure hair, eyes, lips, teeth, nails and jewellery are not masked. Make any corrections to the mask and click "Save to node". Queue another generation, and the workflow will complete the remaining phases.

To make a new image, click "New Fixed Random" in the "Seed - All phases" node before queueing another generation.

Tips:

  • "Schedule steps" is the total number of steps used for all phases. This should be at least 40; I recommend 50.
  • "Phase 1 steps proportion" ranges from 0 to 1 and controls the number of steps in Phase 1. Higher numbers ensure the image composition more closely matches a hypothetical image generated purely using the Flux guidance value for Phase 1, but at the cost of fewer steps in Phases 2 and 3 to impact the masked region. 0.24 seems to work well; for 50 schedule steps this gives 0.24 * 50 = 12 steps for Phase 1.
  • "Flux guidance - Phase 1" should be at least 3.5 for good prompt adherence, well-formed composition of all objects in the image, aesthetic colour saturation and good likeness when using character Loras.
  • You may need to experiment with "Flux guidance (masked) - Phases 2/3" settings to work well with your choice of checkpoint and style Lora, if any.
  • Latent noise is added to the masked region at the start of Phases 2 and 3. The noise strengths can be adjusted in the "Inject noise - Phase 2/3" nodes to vary the level of skin blemishes added.
  • To skip mask editing and use the automatically generated mask each time, click on "block" in the "Edit skin mask" node to select "never".
  • Consider excluding fingers or fingertips from the mask, particularly small ones. Images of fingers and small objects at lower Flux guidance are often posed incorrectly or crumble into a chaotic mess.
  • Feel free to change the sampler and scheduler. I find deis / ddim_uniform works well, as it converges sufficiently for Phase 1.
  • After completing all phases to generate a final image, you may fine-tune the mask by pasting the final image into the "Preview Bridge - Phase 1" node. To do this, right-click on "Preview Image - Powder" (right of this node group) and select "Copy (Clipspace)". Then right-click on "Preview Bridge - Phase 1" and select "Paste (Clipspace)". Queue a generation for a mask to be automatically generated and edit the mask as before. Then, queue another generation to restart the process from Phase 2.
  • Images should be larger than 1 megapixel in area for good results. I often use 1.6 megapixels.
  • Consider using a finetuned checkpoint. I find Acorn is Spinning gives good realistic results. https://civitai.com/models/673188?modelVersionId=757421
  • Use Powder as a first step in a larger workflow. Powder is not designed to generate final completed images.
  • Not every image can be improved satisfactorily. Sometimes a base image will be so saturated or lacking detail that it cannot be salvaged. Just reroll and try again!
submitted by /u/SteffanWestcott

OpenFLUX.1 - Distillation removed - Normal CFG FLUX coming - based on FLUX.1-schnell

2 October 2024 at 14:26

ComfyUI format from Kijai (probably should work with SwarmUI as well): https://huggingface.co/Kijai/OpenFLUX-comfy/blob/main/OpenFlux-fp8_e4m3fn.safetensors

The text below is quoted from the source: https://huggingface.co/ostris/OpenFLUX.1

Beta Version v0.1.0

After numerous iterations and spending way too much of my own money on compute to train this, I think it is finally at the point I am happy to consider it a beta. I am still going to continue to train it, but the distillation has been mostly trained out of it at this point. So phase 1 is complete. Feel free to use it and fine tune it, but be aware that I will likely continue to update it.

What is this?

This is a fine-tune of the FLUX.1-schnell model that has had the distillation trained out of it. Flux Schnell is licensed Apache 2.0, but it is a distilled model, meaning you cannot fine-tune it. However, it is an amazing model that can generate amazing images in 1-4 steps. This is an attempt to remove the distillation to create an open-source, permissively licensed model that can be fine-tuned.

How to Use

Since the distillation has been fine-tuned out of the model, it uses classic CFG. Since it requires CFG, it will require a different pipeline than the original FLUX.1 schnell and dev models. This pipeline can be found in open_flux_pipeline.py in this repo. I will be adding example code in the next few days, but for now, a CFG of 3.5 seems to work well.
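Until the example code arrives, here is a rough sketch of what using it from Python might look like. This is an assumption, not the repo's documented API: the class name FluxWithCFGPipeline is a guess (check open_flux_pipeline.py for the real export), and the parameter names follow standard diffusers conventions.

```python
import torch
# Assumes open_flux_pipeline.py has been downloaded from the ostris/OpenFLUX.1
# repo into the working directory; the class name below is a guess.
from open_flux_pipeline import FluxWithCFGPipeline

pipe = FluxWithCFGPipeline.from_pretrained(
    "ostris/OpenFLUX.1", torch_dtype=torch.bfloat16
).to("cuda")

image = pipe(
    prompt="a red fox standing in fresh snow, golden hour",
    negative_prompt="blurry, low quality",  # classic CFG means negative prompts work
    guidance_scale=3.5,                     # the CFG value suggested above
    num_inference_steps=20,
).images[0]
image.save("openflux_test.png")
```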

submitted by /u/CeFurkan

How are some people able to use EasyNegative on PonyDiffusion V6??

3 October 2024 at 14:05

Hello everyone.

I've been searching for ways to up my prompting and overall picture quality using PonyDiffusion V6. On civitai.com I've looked through a lot of pictures, really nice-looking ones, and noticed that most of them seem to use the EasyNegative embedding.

Now, I know that EasyNegative is meant for 1.5 and not SDXL; however, these folks seem to make it work. I've downloaded the EasyNegative safetensors and the .pt files, but none of them seem to work.

So, my question is, is there anyone willing to help a noob out and tell me how I can manage to use EasyNegative with PonyDiffusion v6?
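For reference, EasyNegative is a textual-inversion embedding trained against SD 1.5's single CLIP text encoder, while Pony V6 is SDXL-based with two text encoders, which is likely why the file won't load as-is. A minimal diffusers sketch of how such an embedding is normally consumed on SD 1.5 (model ID and file path are illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Register the embedding under a trigger word, then reference that word
# in the negative prompt. File path is illustrative.
pipe.load_textual_inversion("./easynegative.safetensors", token="easynegative")

image = pipe(
    prompt="portrait photo, detailed skin, soft light",
    negative_prompt="easynegative",
    num_inference_steps=30,
).images[0]
```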

submitted by /u/Malgosh