
How to Use the Gemini CHATBOT by VOICE ✨ Google AI + 11Labs


Link to Eleven Labs: https://try.elevenlabs.io/inicio

In this tutorial I show you how to use and configure, step by step, a voice-driven AI virtual assistant (also known as a chatbot) built on the Gemini 1.5 LLM developed by Google. It is much easier, faster, and more convenient than ChatGPT, with customizable settings and natural-sounding voices.

Beyond this "Conversation with AI" feature, Eleven Labs offers dozens of audio-processing applications: automatic text-to-speech reading, voice cloning, narration of articles on websites and blogs, separating voice from music, automatic translation of videos into other languages, and many other tools you can use to automate processes, improve your projects, and even monetize them if you have business ideas you want to make money from.
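The conversational setup shown in the video is configured through the ElevenLabs web interface, but as a rough illustration of the text-to-speech side mentioned above, here is a minimal Python sketch against ElevenLabs' public REST API. The API key, voice ID, and model name are placeholders you would substitute with your own values:

    import requests

    API_KEY = "YOUR_XI_API_KEY"    # placeholder: found in your ElevenLabs profile
    VOICE_ID = "YOUR_VOICE_ID"     # placeholder: any voice from your voice library

    # Text-to-speech endpoint of the ElevenLabs REST API
    url = f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}"
    payload = {
        "text": "Hello! This sentence is spoken with a natural-sounding voice.",
        "model_id": "eleven_multilingual_v2",  # assumption: a multilingual model
    }
    headers = {"xi-api-key": API_KEY, "Content-Type": "application/json"}

    response = requests.post(url, json=payload, headers=headers, timeout=60)
    response.raise_for_status()

    # The endpoint returns raw audio bytes (MP3 by default)
    with open("output.mp3", "wb") as f:
        f.write(response.content)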

Someone leaked an API to Sora on HuggingFace (it has already been suspended)

Here's the link https://huggingface.co/spaces/PR-Puppets/PR-Puppet-Sora

Here's the manifesto, in case the page gets deleted:

┌∩┐(◣◢)┌∩┐ DEAR CORPORATE AI OVERLORDS ┌∩┐(◣◢)┌∩┐

We received access to Sora with the promise to be early testers, red teamers and creative partners. However, we believe instead we are being lured into "art washing" to tell the world that Sora is a useful tool for artists.

Hundreds of artists provide unpaid labor through bug testing, feedback and experimental work for the program for a $150B valued company. While hundreds contribute for free, a select few will be chosen through a competition to have their Sora-created films screened — offering minimal compensation which pales in comparison to the substantial PR and marketing value OpenAI receives.

▌║█║▌║█║▌║ DENORMALIZE BILLION DOLLAR BRANDS EXPLOITING ARTISTS FOR UNPAID R&D AND PR ║▌║█║▌║█║▌

Furthermore, every output needs to be approved by the OpenAI team before sharing. This early access program appears to be less about creative expression and critique, and more about PR and advertisement.

[̲̅$̲̅(̲̅ )̲̅$̲̅] CORPORATE ARTWASHING DETECTED [̲̅$̲̅(̲̅ )̲̅$̲̅]

We are releasing this tool to give everyone an opportunity to experiment with what ~300 artists were offered: free and unlimited access to this tool.

We are not against the use of AI technology as a tool for the arts (if we were, we probably wouldn't have been invited to this program). What we don't agree with is how this artist program has been rolled out and how the tool is shaping up ahead of a possible public release. We are sharing this with the world in the hope that OpenAI becomes more open, more artist-friendly, and supports the arts beyond PR stunts.

We call on artists to make use of tools beyond the proprietary:

Open-source video generation tools allow artists to experiment with the avant-garde free from gatekeeping, commercial interests, or serving as PR for any corporation. We also invite artists to train their own models on their own datasets.

Some open-source video tools available are (a minimal example of running one follows the list):

CogVideoX

Mochi 1

LTX Video

Pyramid Flow
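As a concrete starting point, here is a minimal sketch of running CogVideoX through the Hugging Face diffusers library; the model variant, prompt, and generation settings are illustrative assumptions, not part of the original post:

    import torch
    from diffusers import CogVideoXPipeline
    from diffusers.utils import export_to_video

    # Assumption: the 2B variant, which fits on consumer GPUs with offloading
    pipe = CogVideoXPipeline.from_pretrained(
        "THUDM/CogVideoX-2b", torch_dtype=torch.float16
    )
    pipe.enable_model_cpu_offload()  # trades speed for much lower VRAM usage

    video = pipe(
        prompt="A hand-built stop-motion city folding itself into origami birds",
        num_inference_steps=50,
        guidance_scale=6.0,
        num_frames=49,
    ).frames[0]

    export_to_video(video, "cogvideox_sample.mp4", fps=8)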

However, since we are aware that not everyone has the hardware or technical capability to run open-source tools and models, we also invite tool makers to listen to artists and provide a path to true artistic expression, with fair compensation for the artists.

Enjoy,

some sora-alpha-artists, Jake Elwes, Memo Akten, CROSSLUCID, Maribeth Rauh, Joel Simon, Jake Hartnell, Bea Ramos, Power Dada, aurèce vettier, acfp, Iannis Bardakos, 204 no-content | Cintia Aguiar Pinto & Dimitri De Jonghe, Emmanuelle Collet, XU Cheng

submitted by /u/Querens

Food Photography (Prompts Included)


I've been working on prompts to achieve photorealistic and super-detailed food photos using Flux. Here are some of the prompts I used (with a minimal generation sketch after them); I thought some of you might find them helpful:

A luxurious chocolate lava cake, partially melted, with rich, oozy chocolate spilling from the center onto a white porcelain plate. Surrounding the cake are fresh raspberries and mint leaves, with a dusting of powdered sugar. The scene is accented by a delicate fork resting beside the plate, captured in soft natural light to accentuate the glossy texture of the chocolate, creating an inviting depth of field.

A towering stack of mini burgers made with pink beetroot buns, filled with black bean patties, vibrant green lettuce, and purple cabbage, skewered with colorful toothpicks. The burgers are served on a slate platter, surrounded by a colorful array of dipping sauces in tiny bowls, with warm steam rising, contrasting with a blurred, lively picnic setting behind.

A colorful fruit tart with a crisp pastry crust, filled with creamy vanilla custard and topped with an assortment of fresh berries, kiwi slices, and a glaze. The tart is displayed on a vintage cake stand, with a fork poised ready to serve. Surrounding it are scattered edible flowers and mint leaves for contrast, while the soft light highlights the glossy surface of the fruits, captured from a slight overhead angle to emphasize the variety of colors.
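For anyone who wants to reproduce these locally, here is a minimal sketch using the FluxPipeline from diffusers with a condensed version of the first prompt above; the checkpoint choice, resolution, and sampler settings are my assumptions, since the post doesn't say which Flux variant or frontend was used:

    import torch
    from diffusers import FluxPipeline

    # Assumption: FLUX.1-dev; the post does not specify which Flux variant was used
    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.enable_model_cpu_offload()  # keeps VRAM usage manageable on consumer GPUs

    prompt = (
        "A luxurious chocolate lava cake, partially melted, with rich, oozy "
        "chocolate spilling from the center onto a white porcelain plate, "
        "fresh raspberries and mint leaves, a dusting of powdered sugar, "
        "soft natural light, glossy texture, shallow depth of field"
    )

    image = pipe(
        prompt,
        height=1024,
        width=1024,
        guidance_scale=3.5,        # typical setting for FLUX.1-dev
        num_inference_steps=28,
    ).images[0]
    image.save("lava_cake.png")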

submitted by /u/Vegetable_Writer_443

Looking for volunteers for 4090 compute time

I'm cleaning up the CC12m dataset. I've gotten it down to 8.5 million by hand-pruning, but that wasn't as effective as I'd hoped, so I'm falling back on VLM assistance to get rid of 99% of the watermarks in it.

Trouble is, going through a subset of just 2 million is going to take 5 days on my 4090.
It averages 5 images per second, which is 18,000 an hour, or roughly 430,000 in one day.

Would anyone like to step up and contribute some compute time?
You will, if you choose, get mentioned in the credit section of the resulting dataset.

There should be around 5 million images left after my run.
You are free to process any number of 1million image segments that you wish.

(You may even try it on a lesser card, but note that the VLM needs at least 16 GB of VRAM to run.)
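For anyone curious what a filtering pass like this looks like, here is a hypothetical sketch of a yes/no watermark check with an open VLM via transformers. The post doesn't name the actual model, so the checkpoint, prompt, and decision rule below are my own placeholder choices (a 7B VLM in fp16 is at least consistent with the ~16 GB VRAM figure):

    import torch
    from pathlib import Path
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    # Placeholder model: the post does not say which VLM is actually used
    model_id = "llava-hf/llava-1.5-7b-hf"
    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="cuda"
    )

    prompt = (
        "USER: <image>\nDoes this image contain a watermark or overlaid "
        "logo text? Answer yes or no. ASSISTANT:"
    )

    keep, drop = [], []
    for path in Path("cc12m_subset").glob("*.jpg"):
        image = Image.open(path).convert("RGB")
        inputs = processor(images=image, text=prompt, return_tensors="pt").to(
            "cuda", torch.float16
        )
        out = model.generate(**inputs, max_new_tokens=5, do_sample=False)
        text = processor.decode(out[0], skip_special_tokens=True)
        verdict = text.split("ASSISTANT:")[-1].strip().lower()
        (drop if verdict.startswith("yes") else keep).append(path.name)

    print(f"kept {len(keep)}, flagged {len(drop)} as likely watermarked")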

submitted by /u/lostinspaz

Possibility to train LoRA for Shuttle 3 Diffusion?

Hi,

(Just a quick FYI, since it looks like many people are confusing the two: I am talking about SHUTTLE 3 DIFFUSION, not STABLE DIFFUSION 3.)

I have been using OneTrainer to train character-specific LoRAs for the Flux-Dev base model, but the results are worse than what I have been getting from my SDXL LoRAs.

I wanted to try setting Shuttle 3 Diffusion as the base checkpoint for training my LoRA, since I had very good results with that model. I downloaded the Hugging Face repo and used the same settings that work for the Flux Dev base model, but when I select the Shuttle 3 Diffusion Hugging Face repository, I get the following error:

    Traceback (most recent call last):
      File "G:\90_AI\StableDiffusion\OneTrainer\modules\ui\TrainUI.py", line 561, in __training_thread_function
        trainer.train()
      File "G:\90_AI\StableDiffusion\OneTrainer\modules\trainer\GenericTrainer.py", line 674, in train
        model_output_data = self.model_setup.predict(self.model, batch, self.config, train_progress)
      File "G:\90_AI\StableDiffusion\OneTrainer\modules\modelSetup\BaseFluxSetup.py", line 475, in predict
        guidance=guidance.to(dtype=model.train_dtype.torch_dtype()),
    AttributeError: 'NoneType' object has no attribute 'to'

I am not very good at Python, but I can see that the error is related to guidance, and since Shuttle 3 Diffusion is a Schnell-based model, I guess it's missing the guidance conditioning.
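That reading matches the traceback: Schnell-derived checkpoints are not guidance-distilled, so the trainer receives guidance = None and crashes on the .to() call. A hypothetical local workaround (my own sketch, not an official OneTrainer fix) would be to guard the conversion in BaseFluxSetup.predict around the line shown in the traceback:

    # Hypothetical patch near BaseFluxSetup.py line 475:
    # Schnell-based models have no guidance embedder, so `guidance` can be None.
    if guidance is not None:
        guidance = guidance.to(dtype=model.train_dtype.torch_dtype())
    # ...then pass `guidance` (possibly None) through to the transformer call,
    # instead of calling guidance.to(...) unconditionally.

Whether the rest of the training loop then handles a guidance-free model correctly is a separate question; this only removes the immediate crash.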

Does anyone know a way around it? Or what is the best way to train a LoRA using a Schnell-based checkpoint as the base model? Or am I doing something wrong?

Thanks a lot

submitted by /u/Reasonable_Net_6071

Deep Dive Into Kaiber's Powerful Creative AI Playground!


Let's dive into Kaiber's AI powerhouse, SuperStudio! Whether you're new to AI tools or a seasoned pro, this video explores the incredible creative potential of Kaiber's canvas-based design. From generating hyper-realistic images to punk-inspired character concepts, I'll walk you through the ins and outs of this groundbreaking AI platform.

⚡ Key Highlights
• Understanding the new canvas
• Exploring creative templates and Image Lab modules
• Generating stunning visuals and animations with audio-reactive features
• Concept building: from cyberpunk to steampunk, and even punk-punk!

🧠 Why Kaiber 2.0 is a Game Changer
Whether you're creating magical punk rock bands or futuristic cityscapes, Kaiber's ever-evolving platform delivers endless possibilities. Stick around as I showcase how the stencil feature can transform your characters and discuss the impact of its reintroduced audio reactivity.

LINKS:
KAIBER: https://kaiber.ai/superstudio?via=tim
KAIBER TEMPLATES: https://play.superstudio.app/creative-templates?via=tim
Previous video on Kaiber: https://youtu.be/ZjsF64EFrHs

Chapters
0:00 Intro
0:59 Kaiber's New Canvas
1:37 Generating Stunning Images with Flows
02:29 Generating Video in Kaiber
03:16 Getting Weird with Kaiber and Arcane
03:45 Creating a Character in Kaiber
04:10 Blending Aesthetics in Kaiber
04:48 Blending Moods in Kaiber
05:36 Upscaling Images in Kaiber
06:11 Using Kaiber as a Creative World Builder
07:07 Filling Out the Rest of the Characters
07:27 Posing Your Characters in Kaiber (Super Powerful!)
08:42 Animating Your Posed Characters
08:56 Audio Reactivity in Kaiber
09:47 Generating a Title Sequence in Kaiber
10:14 Final Thoughts & Getting Started with Kaiber