DiffusionDigest: The Prodigal Son Returns, SD3's Civitai Hurdles, SD3 Best Practices & Runway's Gen-3 Debut (June 23, 2024)

June 24, 2024 at 10:12

🎨 Welcome to DiffusionDigest for the week of June 16, 2024! In this jam-packed issue, we dive into the ComfyUI creator's new venture, Stable Diffusion 3's licensing drama and best practices, Stability AI's new CEO, Runway's mind-blowing Gen-3 Alpha model, and more exciting AI advancements!

🚀 ComfyUI Creator Resigns, Founds Comfy Org

comfyanonymous, the creator of the popular ComfyUI, has announced his resignation from Stability AI to embark on a new venture called Comfy Org. Joining forces with a team of developers including mcmonkey4eva, Dr.Lt.Data, pythongossssss, robinken, and yoland68, Comfy Org aims to:

🤝 Establish ComfyUI as the leading free, open-source software for AI model inference

🔧 Prioritize development for image, video, and audio models

📈 Enhance user experience and improve safety standards for custom nodes

Source.

🚨 Stability AI Appoints New CEO Amid Funding Concerns

Prem Akkaraju, former CEO of Weta Digital, has been appointed CEO of Stability AI. A group of investors, including former Facebook President Sean Parker, is providing additional funding to the cash-strapped company. Akkaraju's background in the VFX industry has fueled speculation that Stability AI may shift its strategy toward proprietary AI tools for the entertainment industry. The company declined to comment on the matter, which has led some users to believe that it is in "deep crisis mode" and might not continue with its open-source approach.

Source.

⚠️ SD3 Banned from Civitai Due to Licensing Issues

Civitai, a popular AI art platform, has temporarily banned Stable Diffusion 3 (SD3) models due to concerns about the restrictive nature of the SD3 license, which could grant Stability AI too much control over the use of models fine-tuned on SD3.

💬 The decision has sparked a discussion about the importance of clear and permissive licensing in the AI art community. Many users support Civitai's move, expressing disappointment in Stability AI's handling of the SD3 release.

❓ There are concerns about the future of Stability AI, with speculation about the company's financial health and the possibility of acquisition. This uncertainty highlights the need for open communication between model providers and the community.

🤝 The co-founder of Stability AI, Emad Mostaque, suggested rolling back to the prior license as a solution, indicating a willingness to address the community's concerns.

Source.

📝 SD3 Best Practices: Optimizing Results and Avoiding Pitfalls

As users experiment with the new Stable Diffusion 3 model, it's essential to understand the best practices and potential pitfalls. Here are some key tips:

Best Practices:

Use the FP16 version of the SD3 checkpoint for smoother results
Ensure latent image dimensions are multiples of 64
Stick with compatible samplers like Euler, DPM++ 2M, DDIM, and UniPC
Use plain English sentences in prompts, focusing on the most difficult elements first
Experiment with different prompts for the CLIP and T5 text encoders
Try the dpmpp_2m sampler with the sgm_uniform scheduler as a starting point
Aim for image resolutions around 1 megapixel for best quality
Experiment with the "shift" parameter to balance composition messiness and tidiness
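
The two sizing tips above (dimensions in multiples of 64, total resolution around 1 megapixel) can be combined in a small helper. This is a minimal sketch, not part of any SD3 tooling: the function name `sd3_dimensions` and the 1024 x 1024 pixel budget used for "1 megapixel" are assumptions made for illustration.

```python
import math


def sd3_dimensions(aspect_w, aspect_h, target_pixels=1024 * 1024, multiple=64):
    """Return (width, height) close to target_pixels, each a multiple of `multiple`.

    Hypothetical helper: the ~1 megapixel budget and the multiple of 64
    come from the tips above; everything else is an illustrative choice.
    """
    ratio = aspect_w / aspect_h
    # Ideal (unrounded) sides for the requested aspect ratio and pixel budget.
    height = math.sqrt(target_pixels / ratio)
    width = height * ratio

    def snap(value):
        # Round to the nearest multiple, but never go below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


print(sd3_dimensions(1, 1))    # (1024, 1024)
print(sd3_dimensions(16, 9))   # (1344, 768)
```

Because each side is snapped independently, the final aspect ratio and pixel count are only approximate; 16:9 lands on 1344 x 768, about 1.03 million pixels.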

Worst Practices:

Don't rely on negative prompts, as SD3 largely ignores them
Avoid stochastic samplers, which are incompatible with SD3
Don't expect SD3 to handle sensitive content well out-of-the-box
Refrain from using excessively high CFG values to prevent "burnt" looking images
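
As a rough illustration, the tips above can be turned into a pre-flight check on a settings dictionary. Everything here is a sketch under stated assumptions: the dictionary keys, the `check_sd3_settings` name, the substring test for stochastic samplers ("ancestral" or "sde" in the name), and the `max_cfg` default of 5.0 are all hypothetical, since the tips do not give an exact CFG limit.

```python
def check_sd3_settings(settings, max_cfg=5.0):
    """Return a list of warnings for a dict of generation settings.

    Hypothetical helper: keys and thresholds are illustrative assumptions,
    not values taken from SD3 or ComfyUI.
    """
    warnings = []

    sampler = settings.get("sampler", "")
    # Heuristic: ancestral and SDE samplers inject noise at each step.
    if "ancestral" in sampler or "sde" in sampler:
        warnings.append(f"sampler '{sampler}' looks stochastic")

    if settings.get("negative_prompt"):
        warnings.append("negative prompt set, but SD3 largely ignores it")

    # max_cfg is an assumed cutoff for "excessively high" CFG.
    if settings.get("cfg", 0) > max_cfg:
        warnings.append(f"cfg above {max_cfg} may give a 'burnt' look")

    for side in ("width", "height"):
        if settings.get(side, 0) % 64 != 0:
            warnings.append(f"{side} is not a multiple of 64")

    return warnings


# Starting point suggested above: dpmpp_2m sampler with sgm_uniform scheduler.
good = {"sampler": "dpmpp_2m", "scheduler": "sgm_uniform",
        "cfg": 4.0, "width": 1024, "height": 1024}
bad = {"sampler": "euler_ancestral", "cfg": 12,
       "negative_prompt": "blurry", "width": 1000, "height": 1024}

print(check_sd3_settings(good))       # []
print(len(check_sd3_settings(bad)))   # 4
```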

For more detailed best practices and settings recommendations, check out Matteo’s video, and this article authored by Replicate.

🎥 Runway Unveils Gen-3 Alpha: A Leap Forward in Video Generation

Runway has introduced Gen-3 Alpha, a major improvement over its previous generation in terms of fidelity, consistency, and motion. Trained jointly on videos and images, Gen-3 Alpha enables fine-grained temporal control, allowing users to precisely key-frame elements in a scene based on dense captions.

👥 Excels at generating expressive photorealistic humans

⏩ Faster generation times: a 5-second clip in 45 seconds and a 10-second clip in 90 seconds (roughly 9 seconds of generation per second of video)

🔐 Improved visual moderation system and C2PA provenance standards

💡 Powers all of Runway's existing modes and enables new features

Gen-3 Alpha represents a significant step towards building General World Models, offering more fine-grained control over structure, style, and motion in AI-generated videos.

Source.

🆕 Exciting New Developments: LI-DiT-10B, MeshAnything, and 2DN-Pony

LI-DiT-10B: the LLM-Infused Diffusion Transformer (LI-DiT) is a framework that enhances text representation for prompt encoding in text-to-image diffusion models. LI-DiT addresses key challenges like misalignment of training objectives and positional bias in LLMs, leading to significant improvements in prompt comprehension and image quality compared to models like Stable Diffusion 3, DALL-E 3, and Midjourney V6. An API is set to be released next week.

MeshAnything: a new AI model that generates artist-quality 3D meshes with good topology, conditioned on input shapes. While it is currently limited to low poly counts (fewer than 800 faces) and released under a restrictive license, the model shows exciting progress in making 3D asset creation more accessible to non-artists.

2DN-Pony: a new Stable Diffusion XL (SDXL) model that generates both 2D anime style and more realistic 3D style images, aiming for an aesthetic between flat 2D and full realism. Based on Pony Diffusion, the model requires special prompt tags and benefits from negative prompts to achieve its unique look.

That's it for this week's DiffusionDigest! Stay tuned for more exciting updates and insights into the world of Stable Diffusion and generative AI. If you have any questions, feedback, or suggestions for future topics, feel free to reach out.

Happy generating!

submitted by /u/OkSpot3819
