We've got big news about Tencent's open-source multimodal video model, Hunyuan Custom, and its upcoming Open Source Day! Then, I'll give you a quick refresher on what exactly multimodal models are and how they differ from typical diffusion-based AI generation – they're a real game-changer.
We'll then get a little technical (but I'll keep it breezy!) as I walk you through how Hunyuan Custom actually generates video, from reference images and text prompts to the magic behind VAE, LLaVA, and UniDiffuser video. This is where it gets really interesting, as I'll show you how you can use existing video and even audio to drive the AI video generation.
Of course, we need to see the results! I'll show you some examples of Hunyuan Custom's output quality, from human characters and animals to complex scenes with generated backgrounds and actions. And get ready for a shootout! I'm impressed that Hunyuan isn't afraid to go head-to-head with other models like Kling, Pika, and more. We'll look at character referencing, object referencing, and multi-referencing.
The code for Hunyuan Custom should be dropping around May 9th on Hugging Face and GitHub, but I'll share a link where you can try a version of it RIGHT NOW! (Quick note: I had some ISP issues, but you should be good to go!)
Shifting gears, we'll look at Google! While Gemini 2.5 Pro is getting a lot of buzz (interactive visualizers, Godzilla vs. Gorillas simulations!), Gemini 2.0's image model quietly got an upgrade with better visual quality and text rendering. And it's free to use in AI Studio!
Finally, rounding out the freebies, Runway now allows free-tier users to access their image generator, Frames, and character reference features. There are limits, but it's a great way to test things out!
CHAPTERS:
00:00 – Intro & What's Coming Up!
00:36 – Tencent's Hunyuan Custom: The Free AI Video Generator!
01:08 – Understanding Multimodal Models
01:32 - The Problem With Diffusion
01:55 - Multimodal Models
02:28 – How Hunyuan Custom Video Generation Works
03:00 - LLaVA
03:40 – Driving Hunyuan Custom with Video & Audio
04:29 – Hunyuan Custom: The Output Quality!
05:03 - Non Human Inputs
05:27 - Multi Character References
05:47 - Video Inpainting with Reference!
06:18 - Example 2
06:44 - A Day in the Life of A Guy
07:27 – Hunyuan Custom vs. The Competition (State-of-the-Art Shootout!)
07:45 - Example 1
08:18 - Example 2
08:56 - Example 3
09:50 - Example 4
10:49 - Example 5
12:08 - Example 6
12:55 - Shout out to Hunyuan!
13:11 - Driving Video Inpainting is amazing
13:47 – Try Hunyuan Custom For Yourself! (Code & Links)
14:58 – Google News!
15:55 - Google Gemini 2.0 Image Model Update
16:14 - Using AI Studio
16:37 - Generations
16:56 - What we CAN'T do in Midjourney
17:22 - One Major improvement
18:09 – Runway is Now (Kinda) Free!
19:00 – Wrapping Up!
LINKS & RESOURCES:
Try Hunyuan Custom: https://hunyuancustom.github.io/
Hugging Face (Code likely available May 9th): https://hunyuan.tencent.com/modelSquare/home/play?modelId=192
Google AI Studio (for Gemini 2.0): https://aistudio.google.com/prompts/new_chat
My Video on Runway Frames: https://youtu.be/7jwSNb4qq_E
My Video on Runway References: https://youtu.be/umhFIUudEwo
My Video on Runway References vs. Midjourney Omni Reference: https://youtu.be/Poy__YfsQNo
Don't forget to LIKE this video if you found it helpful, SUBSCRIBE for more AI news and tutorials, and hit that NOTIFICATION BELL so you don't miss out on the latest drops!
Ever wondered what happens when you pit two AI image generation giants like Midjourney and Runway against each other, especially with their new character and subject referencing updates? Well, you're in the right place. In this video, I'm not just doing an AI deathmatch (though there's a bit of that fun too), but more importantly, I'm diving deep into best practices, tips, and tricks to help YOU get the absolute most out of both platforms. Plus, we'll see what kind of magic or madness ensues when we combine their powers. Let's jump in and explore the cutting edge of AI creativity!
Big Shoutout to LTX Video for sponsoring today's video! Go check out the NEW LTX video model here: https://ltx.studio/?utm_source=social&utm_medium=influencers&utm_campaign=theoreticallymedia_ltxv13b&utm_ad_id=youtube_dedicatedvideo_02042025
What You'll Discover Inside:
A look at Midjourney's OmniReference feature – its strengths, quirks, and how to get the best results (even if it's more "solo reference" for now!).
Tips for using neutral backgrounds and optimal omni strength settings in Midjourney (there's a sample prompt at the end of this list).
How animated, stylized, and synthetic characters shine with Omni Reference.
Tricks to combat blandness and add that "Midjourney spark" using style references.
Exploring lower omni reference rates for cool, stylized character outputs.
A dive into Runway's reference capabilities, including awesome tips from the Runway team and community for creating action sequences and title cards!
Using timestamps in Runway for creative extrapolations.
My experiments combining Midjourney's aesthetic with Runway's strong referencing for some truly unique character creations.
The potential final touch: leveraging face-swapping tools like Face Fusion for enhanced results.
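For a concrete picture of what those omni strength tips look like in practice, here's roughly what an Omni Reference prompt looks like in Midjourney V7 (the image URL is a placeholder, and the values that work best will depend on your reference):

/imagine prompt: man in a blue business suit standing against a neutral grey studio background --oref https://example.com/your-character.png --ow 100 --v 7

--ow (omni weight) runs from 0 to 1000 with 100 as the default; dropping it low loosens the likeness for more stylized takes (the "lower omni reference" experiments in the video), and adding a --sref image is the style-reference route mentioned above for fighting blandness.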
Chapters:
00:00 – Intro: Midjourney & Runway Updates!
00:27 – Midjourney's Omni Reference: First Look
00:56 – Best Practices: Neutral Backgrounds & Omni Strength
01:11 - OmniReference Walkthrough
01:44 - Man in a Blue Business Suit
02:02 - Alt MiaBBS
02:20 – The "Neutral Background" Importance & Unexpected Results
03:06 – Where Omni Reference Shines: Animated & Stylized Characters
03:20 - Viking Girl 1
03:50 – Tip: Using Style References to Combat Blandness
04:16 - Viking Girl in Cyberpunk Land
04:40 – Experimenting with Lower Omni Reference Values
04:55 - Lower Value Experiments
05:14 - Impressive Results
05:35 - Swapping Out Style Ref
06:06 – Over to Runway: Reference Ideas & Tips
06:21 - LTX Video New Model
08:19 – Runway Tips: 3D Models for Action
09:08 - Title Card Generation
09:36 – Creative Uses: Timestamps & Extrapolating in Runway
10:04 – My Runway Experiments: Likeness
10:36 - Inpainting...sorta.
10:53 - Overall Runway References Thoughts
11:13 – The Big Question: Can We Combine Midjourney & Runway?
11:28 - Imagine...A Long Time Ago...
11:47 - Over to Runway!
12:21 - And Back to Midjourney!
12:56 – The Final Touch: Face Swapping for Polished Results
13:41 - The Final Result(s)
14:12 – Final Thoughts: Tools, Not Versus!
Resources & Links Mentioned:
My Deep Dive on Midjourney's Dash Dash EXP Command: https://youtu.be/x-Q3Y3na0YI
My Video on Runway References: https://youtu.be/umhFIUudEwo
Face Fusion 3.1 Tutorial: https://youtu.be/3pp7qw19nuA
LTX Video: https://ltx.studio/?utm_source=social&utm_medium=influencers&utm_campaign=theoreticallymedia_ltxv13b&utm_ad_id=youtube_dedicatedvideo_02042025
Today, I'm diving into some HUGE Midjourney V7 updates that almost feel like a V7.1. Plus, I've got a first look at a seriously impressive new lip sync model you can try right now! We'll also check out a cool new feature from today's sponsor, Recraft – don't worry, it's actually pretty neat!
What's Inside This Video:
Midjourney V7's Big Update: While we're still waiting for the much-anticipated Omni Reference (aka Subject Reference or working Siri!), Midjourney dropped some significant improvements to V7. I'll break down what's new and if it fixes those 'unrealistic poses' and 'blotchy background characters'.
NEW Experimental Aesthetics Command: Meet --exp! I tested this new command that works like 'stylize' to add dynamic elements and creativity. Find out the best values to use (hint: less might be more!) – there's an example prompt below this list.
Editor Improvements: Took a quick look at the updated Midjourney editor, focusing on the texture feature which is pretty cool for visual explorations. Got editor tips? Share them below!
V7 Model Polish: Are the Gumby characters and background weirdness gone? We'll see how the updated V7 model stacks up.
Omni Reference Sneak Peek: It's delayed, but I got a glimpse of the upcoming Omni Reference feature using CEO David Holz as the subject! Looks promising and retains that classic Midjourney aesthetic.
Tavus Hummingbird Lip Sync: Checked out the new Hummingbird lip sync model from Tavus. It's available on their site (with a free tier!) and on Fal. See how it handles my head bobbles vs. an AI character!
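If you want to poke at --exp before watching, it drops into a prompt like any other parameter (the toast prompt here is just an illustration, echoing the tests in the video):

/imagine prompt: a single slice of toast on a diner plate, soft morning light --exp 10 --v 7

--exp accepts values from 0 to 100; as hinted above, lower values tend to add flavor without steamrolling the rest of your prompt, while maxing it out can override your other settings.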
Chapters:
00:00 – Intro: Big Midjourney Updates & New Lip Sync!
00:35 – Tracking V7 Since Launch
01:11 – V7's Initial Quirks
01:30 – The Latest V7 Update (Feels like 7.1?)
01:41 - Where is OmniReference?
02:16 – NEW Command: Experimental Aesthetics (--exp)
02:31 - What is --exp?
02:41 - Toast Test!
03:10 – Testing --exp Values (Toast Experiments!)
03:37 - Experimental Toast!
04:04 - Maximum Toast!
04:22 – Finding the Sweet Spot for --exp
04:34 - v7 old prompts now with --exp!
05:32 – Midjourney Editor Updates & Texture Feature
05:57 - Edit and Retexture Tutorial
06:46 – V7 Model Improvements: Less Jank?
07:23 - But is it Perfect? No, it is not.
07:51 – Omni Reference SNEAK PEEK!
08:43 – Other Reference Options (Runway Recap)
08:55 - Recraft New Feature
09:14 - Recraft Infinite Style Library
11:43 – NEW Lip Sync: Tavus Hummingbird Model
11:54 - Hummingbirds at the White Lotus
12:31 - Where you can use it for FREE!
12:59 – Testing Hummingbird (Me vs. AI Character)
13:16 - Result - AI TIM is here!
13:43 - Where Hummingbird flies!
14:18 – Final Thoughts & Call for Lip Sync Models!
Resources Mentioned:
Recraft.AI: https://go.recraft.ai/theoreticallymedia_styles
use code: “MEDIA11” for $11 OFF any paid plan!!
Runway References Video: https://youtu.be/umhFIUudEwo
Tavus Lip Sync: Try Tavus Hummingbird (Free Tier!): https://www.tavus.io/model/hummingbird
I promised to cover the rolling Midjourney updates, and while Omni Reference keeps us waiting, this V7 refresh and the new --exp command are definitely worth exploring. The Tavus Hummingbird lip sync also looks like a strong contender! What do you think of these updates? And what other lip sync models should I check out for a future shootout? Let me know in the comments below!
Don't forget to Like this video if you found it helpful, Subscribe for more AI deep dives, and hit that notification bell!
Ever struggled with keeping characters and objects consistent in your AI videos? Well, guess what? Runway just dropped a potential game-changer: the References feature! It might just be the solution we've all been waiting for.
In this video, I'm diving deep into Runway's new References tool. We'll explore:
What References does: How it uses Runway Frames to generate a first frame with your character/object reference.
Getting Started: Simple steps to use the feature (hint: don't just drag and drop!).
Putting it to the Test: See how it handles single characters, multiple characters (like our man in the blue suit and his wolf companion!), and different styles.
Tips & Tricks: Learn how to potentially avoid issues like attribute bleed and use styles effectively.
Advanced Techniques: Combining references with specific locations, using character sheets, and a sneaky trick for establishing shots.
Community Showcase: Check out awesome examples from Dave Clark, Tom Likes Robots, and Andy McNamara.
Is it perfect? Not yet. But it's a massive step forward for AI video creation and character consistency. Ready to see if Runway cooked with this one? Let's taxi down the runway and find out!
Chapters:
00:00 - Intro: The AI Video Consistency Problem Solved?
00:23 - Runway's Recent Updates (Frames, Gen4)
00:38 - Introducing the References Feature
00:51 - Reference First Release and Congrats to the 48 gang!
01:13 - References Wide Release
01:26 - How References Works (Using Frames)
02:02 - Simple Example: Man in a Blue Business Suit
02:37 - How to Use References Correctly
02:56 - One Character Reference
03:32 - Video Output
03:58 - Multiple References: Adding the Wolf
04:12 - Prompting the Wolf
04:37 - Video Output
05:00 - Using Styles with References
05:26 - CSI K9
05:39 - Pushing Styles to Break
06:00 - Generating a Crime Movie
06:25 - Challenges: Bad Outputs
07:00 - Attribute Bleed?
07:10 - The solve with a new Character
07:33 - Creating Cinematic Scenes (Master & OTS Shots)
07:52 - Creating Locations to use as Reference
08:31 - Video Output
08:52 - Gen4 vs. Turbo Model Comparison
09:13 - Does Generating in Frames Matter?
09:38 - Video Output
10:12 - Pro Tip: Using Character Sheets & Establishing Shots
11:00 - Sneaky Tip!
11:38 - Video Output
11:41 - Community Examples
12:44 - Final Thoughts: Did Runway Cook?
Resources Mentioned:
My Full Breakdown on Runway Frames: https://youtu.be/7jwSNb4qq_E
Let me know in the comments: What do you think of Runway's References feature? Have you tried it yet? What are you excited to create?
AI Video's ten-second generation wall has officially been SMASHED! I'm diving into something truly game-changing today: FramePack. This open-source tool lets you generate AI videos up to (and even beyond!) ONE MINUTE long, right now, for FREE. Forget those short clips – we're talking serious length here, and it's compatible with generators like Wan, Hunyuan, and more.
In this video, I'll break down:
How FramePack overcomes the old drifting and coherence issues using cool tech like anti-drifting sampling.
How YOU can get it running, whether you have an Nvidia GPU (even with just 6GB VRAM!) using Pinokio, or if you're on a Mac using Hugging Face.
Step-by-step guides for both installation methods (plus a small scripted sketch of the Hugging Face route just below).
Tips for using the tool, including dealing with TeaCache for better results (or maybe turning it off!).
Lots of examples, including successes and some… well, let's call them "learning experiences" (dancing girl goes exorcist, anyone?).
Limitations I found, like issues with tracking shots.
This tech is brand new and evolving fast, but it's already opening up incredible possibilities for longer-form AI video. Let's explore it together!
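One extra note on the Hugging Face route: duplicating a FramePack Space onto your own paid GPU hardware can also be scripted with the huggingface_hub library. This is just a minimal sketch; the Space ID below is a placeholder (grab the actual FramePack Space from the link in the resources), and the hardware tier is one example of what Spaces currently offer:

from huggingface_hub import duplicate_space, login

login()  # paste a Hugging Face token with write access

# Duplicate an existing FramePack Space to your own account on GPU hardware.
# "someuser/FramePack" is a placeholder ID; "a10g-small" is one example paid GPU tier.
repo = duplicate_space(
    "someuser/FramePack",
    private=True,
    hardware="a10g-small",
)
print(repo)  # prints the URL of your personal copy of the demo

Remember that paid hardware bills by the hour, so pause the Space (or set a sleep time) when you're done generating.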
Chapters:
00:00 - Intro: Breaking the AI Video Length Barrier!
00:46 - Meet FramePack: The Game Changer?
01:24 - How FramePack Works (No More Drifting!)
02:01 - What questions does FramePack ask?
02:44 - Hardware Requirements (You Might Already Have It!)
03:08 - Method 1: Installing with Pinokio (Nvidia Users)
04:18 - Method 2: Using Hugging Face (Mac & Others)
05:49 - Using FramePack: Settings & Tips (TeaCache Explained)
07:27 - Generation Examples & Experiments (Hourglass Timer!)
08:36 - More Examples: Detective
08:56 - Dancing Girls
09:42 - TeaCache On vs. Off Comparison
10:10 - Known Limitations (Tracking Shots)
10:45 - Some great use cases (Moving Pictures)
11:20 - Endless Lofi Girl
11:54 - This is not the only one: The TTT Model
Resources Mentioned:
Pinokio Installer: https://pinokio.computer/
Hugging Face Space (Duplicate This!): https://huggingface.co/spaces
FramePack Paper: https://github.com/lllyasviel/FramePack
TTT (Tom & Jerry) Paper: https://test-time-training.github.io/video-dit/
The Veo-2 PRICE BREAK we’ve been waiting for!
LTX Studio have dropped Veo-2 on their platform with an INSANE launch promo & extra Veo credits!
You’ll want to check this out!
https://ltx.studio/lp/veo-2-offer
Some WILD AI camera moves here! Start your 7-day free trial with HoneyBook and see the difference - https://bit.ly/HoneyBook-Theoretically_media
Today I'm diving deep into Higgsfield, an AI video platform that's making waves with some seriously wild virtual camera movements! Ever seen those cool, hyper-stylized AI videos popping up on your feed? Chances are, they came from Higgsfield.
I spent some time exploring whether this platform is just a gimmick or a genuinely useful tool for creators. Is it all just cherry-picking, or can it really deliver? Let's find out!
LINKS:
Higgsfield: https://higgsfield.ai
escape Platform: https://escape.ai/
Paige Piskin : https://x.com/PaigePiskin/status/1912313290409062697
My thanks to today's sponsor: HoneyBook is more than just a CRM platform. It's more of a behind-the-scenes partner to eliminate busywork, reclaim your time, and keep you focused on what truly matters.
https://bit.ly/HoneyBook-Theoretically_media
In this video, I'll walk you through:
What is HiggsField? A look at the platform and its niche in hyper-stylized shots.
Exploring the Presets: From explosions and bullet time to the... unique... tentacle effect. (Spoiler: I get a little Lovecraftian later!)
Testing the Camera Movements: Putting presets like Zoom Out, Kiss POV (yes, really!), 360 spins, Snorri Cam, Disintegration, and Glass Shatter to the test.
Bullet Time Deep Dive: My personal favorite effect! We even recreate a Matrix scene (with a surprise Dwight Schrute lookalike!).
Mixing & Matching Effects: Combining presets like Building Explosion and Arc for dynamic shots.
Overcoming Limitations: Tackling the 5-second clip limit with the last-frame-first-frame trick and discussing potential extension solutions.
Style Consistency: Does it maintain styles like anime, or does it lean towards realism?
9:16 Format: Why Higgsfield shines for TikTok and Instagram content.
Is it Worth It? My thoughts on using Higgsfield for cinematic projects vs. social media, plus a breakdown of the pricing.
Chapters:
00:00 - Intro: Wild Camera Moves!
00:44 - Exploring Higgsfield Presets
01:34 - Useful vs. Gimmicky Effects
02:00 - Testing: Noir Detective Zoom Out
02:35 - Zoom and Arc
03:15 - Michael Bay Level Explosions!
03:54 - The Virtual Smooch: Kiss Preset
04:46 - 360 Spins
05:05 - Snorri Cam Showcase
05:58 - Disintegration
06:22 - Tentacles (Lovecraft Time!)
07:08 - HoneyBook
08:52 - Bullet Time Fun (Matrix BTS!)
09:48 - Side Note: John Gaeta & escape.ai
10:24 - Cool Effect: Glass Shatter
10:49 - Mixing Presets: Explosions + Arc
11:41 - Extending Clips: The 5-Second Limit
12:19 - Last First Frame
13:15 - Style Consistency
13:51 - 9:16, where this flies
14:10 - But still good at 16:9
14:35 - Pricing & Final Thoughts
Higgsfield offers some genuinely impressive (and sometimes hilarious) tools for creating eye-catching AI video content, especially for shorter formats. While the 5-second limit is a hurdle, the unique camera movements and effects make it a platform worth exploring, even if just for adding some 'zazz' to your projects.
What do you think? Is Higgsfield a useful tool or just a fun gimmick? Let me know in the comments below! 👇
Don't forget to LIKE 👍 if you found this helpful and SUBSCRIBE for more AI video explorations!
Kling 2.0 just dropped, and you know I had to dive right in to see if the reigning champ of AI video generation got even better. In this video, I'm putting Kling 2.0 through its paces, comparing it to the previous 1.6 model (which was already crushing it!) and seeing how it stacks up against the competition like Google's VEO 2. We'll look at text-to-video, image-to-video, new features, and find out what's still missing and what's coming next. Let's see if Kling keeps the crown! 👑
Chapters:
00:00 - Intro: Kling 2.0 Arrives!
00:41 - First Look: Fidelity & Coherence Boost
01:03 - Text-to-Video: Game of Thrones Direwolves (and Real Life!)
01:26 - Direwolf Results
01:59 - Text To Video Test 2
02:13 - Image-to-Video Quick Test (vs. VEO 2 Style)
02:59 - Overall Text To Video Impressions
03:15 - Text-to-Video: Strengths & Weaknesses (Cowboy Cosplay?!)
03:43 - Coherent Fast Motion: Kung Fu Fight Scenes!
04:41 - More Examples (Thanks Justin!): Punching Bags & Blizzards
05:19 - Running Test
05:50 - Generation Specs & Camera Control
06:09 - Lens and Camera Motion Callouts
07:14 - Image-to-Video Deep Dive (ft. Kolors 2.0)
07:50 - Sci-Fi Vibe Short
08:32 - Saucing Up the Footage
09:02 - Less Generations
09:22 - Impressive Walking Cycles
09:58 - Stylistic Retention with Image to Video
10:49 - AI Girlfriends and Crusty Pirates
11:45 - Final Frame Trick
12:26 - NEW Multi-Elements Feature
13:55 - Final Verdict: Still the King?
What I Cover:
Comparing Kling 2.0 vs 1.6 and Google VEO 2.
Testing Text-to-Video: Examples range from direwolves and stock footage to kung fu and slightly awkward cowboys.
Exploring Image-to-Video: Checking consistency, character animation, style adherence, and complex motion like walking and fighting.
Camera Control & Lens Simulation: Does asking for an "85mm lens" work?
New Multi-Elements Feature: A quick look at swapping elements between videos (using the 1.6 model for now!).
Creating a short sci-fi piece using Kling 2.0, Udio for music, and Topaz for upscaling.
What do YOU think of Kling 2.0? Is it still the best AI video generator out there, or are you using something else? Let me know down in the comments! 👇
Don't forget to LIKE 👍 if you found this helpful and SUBSCRIBE for more AI explorations!
Thanks for watching!
Tim!
Google's Cloud Next event just wrapped, and wow, did they unload a ton of AI news! It felt a bit like a shotgun blast of announcements, but I'm homing in on the cool creative AI stuff – the latest on Veo 2, Imagen 3, and a brand new AI music generator called Lyria. Did Google cook? Yeah, they did! But can you get a plate easily? Well... maybe it's more like the drive-thru for now.
In this video, I break down the biggest creative AI takeaways from Google Cloud Next:
What's New: We got updates on the already impressive Veo 2 text-to-video, the next iteration of their image generator (Imagen 3), Chirp 3 for text-to-speech, and the new Lyria AI music tool. Google's flexing that they're the only platform doing video, music, image, and speech natively... sort of.
Veo 2 Power-Ups: I dive into the new Veo 2 features like inpainting (think magic eraser for video!), aspect ratio changes, camera controls, and extending video length. Plus, a cool look at how Veo 2 helped bring The Wizard of Oz to the Sphere!
Imagen 3 & Chirp 3: Imagen 3 gets better detail and even image editing capabilities now. Chirp 3 lets you create custom voices with just seconds of audio.
Lyria Music: Google's answer to AI music? It generates short, instrumental jingles – maybe not dropping fire beats just yet, but useful for content creators.
Vertex AI Access: Here's the catch – most of this cool stuff lives on Google's Vertex AI platform, which looks enterprise-y but is actually accessible.
FREE AI Fun?: The best part? I show you how you might be able to test drive Imagen 3 and the powerful base Veo 2 on Vertex AI for FREE right now using their trial credits (seriously, my cost is $0 so far!). No guarantees on how long this 'open cookie jar' lasts, though! There's also a small code sketch below the Vertex link if you'd rather script it.
Gemini 1.5 Pro: Also, Gemini 1.5 Pro is widely available and has honestly become my daily driver LLM. I even show a neat trick using it with Veo 2.
GOOGLE VERTEX: https://cloud.google.com/vertex-ai?hl=en
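If you do spin up a trial project, here's a minimal sketch of what calling Imagen from the Vertex AI Python SDK looks like. The project ID is a placeholder and the model name is an assumption based on Vertex's Imagen 3 listings, so double-check the current ID in the Model Garden before running it:

import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

# Placeholder project ID: use the project your trial credits live under.
vertexai.init(project="your-project-id", location="us-central1")

# Model name is an assumption; confirm the current Imagen 3 ID in the Model Garden.
model = ImageGenerationModel.from_pretrained("imagen-3.0-generate-001")

images = model.generate_images(
    prompt="cinematic wide shot of a man in a blue business suit on a rainy street",
    number_of_images=1,
    aspect_ratio="16:9",
)
images[0].save("imagen_test.png")

This only covers the image side; Veo 2 is accessed separately inside Vertex, which is what I walk through in the video.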
Chapters:
00:00 - Intro: Google Cloud Next AI Overload!
01:06 - Veo 2 & The Wizard of Oz Sphere Project
02:13 - Gemini 1.5 Pro Goes Wide
02:32 - Creative AI Suite: Veo 2, Imagen 3, Chirp 3, Lyria
03:19 - Lyria AI Music Generator: First Look
04:17 - Imagen 3 Gets Updates & Editing
04:38 - Chirp 3: Custom Text-to-Speech
04:50 - Veo 2 NEW Features: Inpainting, Camera Controls & More!
05:30 - Inpainting in Veo-2
05:52 - Expanding the Frame in Veo-2
06:21 - Camera Controls in Veo-2
06:40 - First Frame Last Frame
06:57 - Extend Video in Veo-2
07:27 - Editing Toolbar in Veo-2
07:43 - Accessing Google's AI: The Vertex Platform
08:48 - How to Use Vertex AI (FREE Veo 2/Imagen 3?!)
10:43 - Cool Trick: Gemini + Veo 2 Prompts
12:01 - Final Thoughts: Grab the Free AI While You Can!
What do you think about these Google AI updates? Are you going to try accessing Veo 2 or Imagen 3 on Vertex AI? Let me know in the comments below!
Don't forget to LIKE this video if you found it helpful and SUBSCRIBE for more AI adventures!
Microsoft just dropped an AI bomb on an Id Software classic! They've unleashed an AI-powered version of the legendary Quake II, and YES, you can play it RIGHT NOW! But should you? 🤔 Let's dive into this wild experiment, ask the big "why," and try to figure out where this whole AI gaming thing is headed. Grab your Railgun and some Quad Damage – it's gonna be a trip!
What's Going On with AI Quake II?!
This isn't your daddy's Quake (unless your dad is super into cutting-edge AI!). Microsoft Research, the wizards behind the curtain, have unleashed "Muse," their AI world agent, to recreate Quake II. Following in the footsteps of Google's AI Doom, this project is seriously mind-bending. But hold on a sec... is it actually Quake II? Let's break it down!
Chapters:
00:00 – The AI Quake II Drop! - Microsoft's surprise release and initial thoughts.
00:23 – What Exactly IS AI Quake II? - Diving into the tech and its predecessors like Google's AI Doom.
00:45 – A Quick Quake II History Lesson - A blast from the past on the original game's impact.
01:15 – From Xbox to AI: The Evolution - Tracing Quake II's journey to this AI iteration.
01:56 – Meet Muse: The AI Behind the Magic - Exploring Microsoft's AI world agent and its upgrades.
02:21 – Tech Specs & What This Isn't - Clearing up confusion: it's research, not a polished game!
02:52 – AI-Generated Worlds: The Core Concept - Understanding how this interactive AI video works.
03:10 – Muse's Level Up: Faster and Bigger! - Examining the improvements in frame rate and resolution.
03:42 – Training Time: From Years to a Week?! - The surprisingly short training period for this AI model.
04:12 – The Big Question: Is It a Good Game? - Spoiler alert: not quite yet!
04:21 – The Janky Reality: Object Permanence & More - Facing the current limitations of AI game generation.
04:56 – Hands-On (Kind Of): My Hilarious Gameplay Experience - Trying out the keyboard-only controls and weird glitches!
06:05 – Game Over (Literally): The Short Play Session - Highlighting the limited playtime.
06:15 – Why Does This Exist?! The Big Picture - Exploring Microsoft's vision for AI and game preservation.
07:00 – The Future of AI in Gaming: My Thoughts - Reflecting on the progress and what's to come.
07:56 – The "Can It Run Crisis?" of AI Games! - My proposal for a new AI benchmark.
LINK: Previous Video On MUSE:
https://youtu.be/wbfhdq7f7jA
Play AI QUAKE 2: https://copilot.microsoft.com/wham?features=labs-wham-enabled
The Nitty-Gritty (Tech Specs & More):
This AI Quake II is a research project by Microsoft's "Muse" (formerly WHAM!).
It generates interactive video at roughly 10 frames per second (up from 1!).
The output resolution is 640x360 (doubled from the previous model).
The Quake II model was trained on about one week of Quake II video.
Controls are currently limited to keyboard input (WASD for movement, arrow keys to look, F to fire).
Object permanence is a major challenge – if you look away for less than a second, things disappear!
Health values and other in-game counters aren't super reliable.
Gameplay sessions are currently limited in duration.
Why Bother with AI Quake II?
While it's not exactly replacing your gaming rig anytime soon, this project hints at a fascinating future. Phil Spencer from Microsoft Gaming has talked about using AI to learn old games and make them playable on any platform, regardless of the original hardware. Think game preservation on a whole new level!
The Future is Now (Kind Of):
We're seeing rapid advancements in generative AI for games, from AI-powered NPCs to 3D asset creation. While a fully AI-generated AAA title is still a ways off, projects like Muse and Google's Genie are laying the groundwork. Could this be the beginning of the "ChatGPT moment" for video games? I think we might just be seeing the very first sparks.
Midjourney V7 is finally here, and I'm diving deep! I'm testing out the new features, including the personalization profile and the super-fast draft mode. Is it worth the hype? Let's find out!
In this video, I'll cover:
My first impressions of Midjourney V7.
The new personalization profile setup.
A look at the draft mode and its speed.
Prompt tests and image quality comparisons.
What's missing and what's coming next.
If you're into AI art, you won't want to miss this! Hit like, subscribe, and share your V7 thoughts below!
Chapters:
00:00 – Intro: Midjourney V7 Arrival!
00:24 – Just How Long Has It Been?
00:55 - The Landscape Has Changed
01:32 – Key Features Overview
01:45 - Getting Started With Personalize
02:35 - Options Toolbar Changes
03:10 – Prompt Testing Toast
03:46 - Prompt Test Beauty
04:11 - Prompt Testing Surreal
05:03 - Prompt Testing Wizards
06:09 - Hard Prompt Test
06:44 - New Feature: Draft Mode
07:32 - Draft Mode Test
07:55 - Missing Features
08:13 - More Hard Prompts
08:45 - Blurry Faces
09:04 - Impressions
09:36 - What is Coming Up for Midjourney?
10:04 - Midjourney Video
10:42 - Is Midjourney the Best Image Generator Around?
10:53 – Does "the best" matter?
11:08 - Coming up Next
Today, we're diving deep into RunwayML's Gen 4, the latest update to their AI video generation model. Runway claims that Gen 4 boasts improved fidelity and output quality. More importantly, Gen 4 introduces better consistency in character, location, and color grading.
While the highly anticipated reference feature isn't available at launch, there's still plenty to explore. I'm putting Gen 4 through its paces with a series of tests, checking out community outputs, and sharing some best practices.
We'll take a look at everything from image-to-video generation and character consistency to some fun experiments with text and effects.
I’ll also be comparing Gen 4 to previous Runway versions like Gen 3 and Gen 2.
Runway's Gen 4 is a significant step forward, and I'm excited to share my findings with you. There are still more updates on the way. So, buckle up, and let's jump into the world of Gen 4!
RunwayML: https://runwayml.com/
Runway's Frames Video: https://youtu.be/7jwSNb4qq_E
00:00 - Intro to Runway Gen 4
00:29 - Gen 4 Overview and Key Features
01:17 - Gen 4 at Launch
01:48 - Frames to Video (Man In the Blue Suit)
01:59 - First Video Test
02:24 - Frames Image Generation
02:44 - London Test
02:58 - Gen 4 vs. Gen 3 vs. Gen 2
04:02 - Gen-4 Text (Woman in the Red Dress)
05:13 - Impressive Wide Angle Shot
05:48 - Gen 4 Wonk
06:29 - Expand Feature With Gen-4
07:00 - Fight Sequences (Raging Bull)
07:32 - Noir Test & What Reference Might REALLY Be
08:15 - Noir Test 2
08:42 - How Powerful Will Reference Be?
09:55 - Image to Video
10:29 - No Prompt with Image To Video Test
11:07 - Working with Multiple Faces
12:13 - Grandma with a Flamethrower
12:21 - Style Consistency with Image to Video
12:52 - Cyberpunk Woman
13:55 - Gen-4 To Restylize
13:59 - Community Outputs Showcase
15:58 - Final Thoughts on Gen 4
Let's take a tour of one of the most interesting AI generation platforms I've come across: Flora. If you're into AI art, image editing with Gemini, and even video generation, this platform has it all.
I know the workflow might look a bit intimidating at first – all those Blocks and connections! But trust me, it's not as scary as it seems. I'll walk you through the basics, from generating images with text prompts to using cool features like style reference and video-to-video generation.
We'll be playing around with Gemini's editing features, creating our own character references, and even training a custom style. Plus, I’ll show you how to take those images and turn them into videos!
Whether you're into clean, organized workflows or a bit of creative chaos, Flora has something for you. And the possibilities? Let’s just say they got me thinking about fight sequences and character turnarounds!
Check out Flora for yourself! They offer 2000 free monthly credits so you can explore the platform and see if it’s a good fit for you.
LINKS:
FLORA: https://tinyurl.com/floratim
CHAPTERS
00:00 - Intro to Flora: An AI Playground
00:28 - The AI Murderboard Isn't Scary
00:52 - Understanding the Workflow
01:19 - Text to Image Generation with LLMs
01:55 - Image Generation Options
02:42 - Video Generation Options
03:21 - Organization on a Canvas
03:53 - Gemini's Image Editing Features
04:46 - Flora Styles
05:25 - Cinematic Crime Film Style
06:30 - Training Your Own Style
07:33 - Testing Out a Custom Style
08:27 - Video Generation Tests
08:57 - Off Book Video to Video Generation
09:50 - Character Edits With Gemini
11:26 - Impressive Turnarounds!
12:37 - Thoughts on Turnarounds Use Cases
12:57 - Advanced Flora Technique
13:25 - Storyboarding with AI
13:54 - Flora's Free and Paid Plans
14:12 - Next Up for Flora
Hey everyone! I'm super excited to finally share a full production breakdown of my AI short film, "The Bridge." The response has been incredible (almost 400,000 views across platforms!) and I'm here to spill all the secrets on how I made it. In this video, I’ll walk you through the entire process, from pre-production and AI tools, all the way to final post-production. I’ll also dive into the costs and compare them to traditional filmmaking.
If you haven’t seen the short yet, don’t worry, I’ve included it in the video!
My Thanks to Recraft for Sponsoring Today's Video! Check out Recraft – https://go.recraft.ai/media. Use my code MEDIA for $12 off any paid plan.
LINKS:
Jason Zada's Wu-Tang Clan Video: https://youtu.be/ZBTb_xJBh5c?si=kqZPfc0h30F2oTBN
Henry’s Prompt Tweet: https://x.com/henrydaubrez/status/1894513057109348405
Google Labs Discord: https://discord.gg/googlelabs
Topaz Upscale: https://topazlabs.com/ref/2518/
ElevenLabs: https://try.elevenlabs.io/w5183uo8pydc
Hume: https://www.hume.ai/
Hedra: https://www.hedra.com/
CHAPTERS
00:00 – Introduction
00:37 - The Bridge Views
01:20 - The Bridge
03:37 – Pre-Production
04:11 – AI Tool Selection
05:10 - Re-Inspired
05:31 – Prompt Engineering
06:13 - Working with an LLM on a Film
06:33 – Veo-2's Super Power
07:50 – Achieving Visual Consistency
09:10 – Audio Production
09:52 – Lip Syncing with Hedra
10:32 - Upscaling
11:16 – More Upscaling
11:52 - Making a Poster w/ Recraft
13:23 – Post-Production in Premiere Pro
13:55 - Death's Voice
14:40 – Is it Perfect?
15:25 - Cost Breakdown
17:01 - Comparison To Traditional Filmmaking
18:11 – Make Movies
18:55 - It's just not there yet
OpenAI has dropped a brand NEW AI Image generation model, and it's NOT Dall-E 4! In this video, I'm taking you on a deep dive into this remarkable AI Model and showing you EVERYTHING it can do. We'll explore its impressive capabilities, uncover some hidden features, and yes, even talk about its quirks. Is this the end for avocado chairs?!
In this video, I cover:
* OpenAI's new AI image generator (it's not Dall-E!)
* Image generation AI tools and comparisons
* AI art techniques and workflows
* Text-to-image AI generation
* AI character design and consistency
* Sora AI video generation updates
* AI art community highlights
* New AI tools and software
This new AI is seriously impressive, but it also has some limitations. I'll show you what I've discovered in my testing, including how it handles text, image referencing, and creating consistent characters. Plus, we'll take a look at the latest Sora updates and some incredible AI art from the community.
If you're interested in AI art, image generation, or just the latest in AI tech, you NEED to watch this!
LINKS:
THE BRIDGE: https://youtu.be/YDlME4qvER8
OpenAI: https://openai.com/
Reve: https://preview.reve.art/app
Ideogram: https://ideogram.ai/
#AI #ImageGeneration #OpenAI #Dalle #Dalle3 #Sora #ArtificialIntelligence #TechReview #AIArt #Stablediffusion #Midjourney #AItools #TexttoImage #CreativeAI #aimusic
Chapters:
00:00 - Intro: OpenAI's New Image Generator - Is Dall-E Dead?
00:18 - Quick Announcement: My AI Short Film - The Bridge
00:41 - Goodbye Dall-E? First Look at the NEW AI Model
01:28 - Image Generation Tests: Blue Suit Guy, Samurai & More!
02:11 - Controls and Options
03:04 - Samurai Test
03:15 - Clown with a chainsaw
03:45 - AI Challenge: Complex Prompts with the Woman in the Red Dress
04:17 - Remixing for Different Angles
04:49 - Creative AI: Underwater Scenes, 90s Nostalgia & VHS Tapes
05:24 - The VHS Tape
05:48 - Text Generation is INSANE! Alan Wake Novelization Test
06:56 - GTA 7 Box Art Test
07:52 - Image Referencing EXPLAINED: How to Use Reference Images
08:04 - Scrambling Faces
08:26 - Not John Wick
09:04 - Multiple Image References
10:24 - Illustrated to Photo with Image Reference
11:13 - Sora Time
11:48 - Community Outputs
13:04 - More AI Tools You NEED to Know: Reve, Ideogram 3.0
Presenting: The Bridge. An AI short film utilizing Google’s Veo-2. I’m really proud of this one, as my goal (as always) is to push storytelling, performance, and narrative in this emerging art form, and I feel like I managed to pull off something close to that here.
Every shot here utilized Veo-2, although there were a few post-generation tricks here and there.
I’ll cover everything in detail in an upcoming video. In the meantime, I hope you enjoy the short.
Adobe AI & Flux! Build your own web app with Hostinger Horizons: http://hostinger.com/tmedia & use Code TMEDIA for 10% off!
In this video, I dive deep into some HUGE updates in the world of AI image and video generation! Adobe is surprisingly opening up its ecosystem to third-party models like Black Forest Labs (Flux), Google's Imagen 3, and RunwayML's Frames – right inside Adobe Express and Project Concept! I explain what this means for creators, especially the ability to finally "get dirty" with models beyond Firefly. Plus, I'm testing out Stability AI's brand new FREE virtual camera tool and showing you how it could revolutionize how we control camera angles in generated content. I even use it in conjunction with Google's Gemini and Runway!
Key Topics:
Adobe's surprising partnerships with external AI model providers.
What Flux, Imagen 3, and Runway Frames bring to the Adobe ecosystem.
The implications for Photoshop and Premiere Pro (and what about Sora?).
Hands-on with Stability AI's Stable Virtual Camera.
Combining Stable Virtual Camera with Gemini 2.0 and Runway Gen 3 for enhanced control.
Project Concept Beta Waitlist: https://concept.adobe.com/discover
Stability AI Virtual Camera (Hugging Face): https://huggingface.co/spaces/stabilityai/stable-virtual-camera
Previous Video (Gemini 2.0): https://youtu.be/llvyFBTyiGs
Chapters
00:00 - Intro: Adobe Enters the Black Forest!
00:25 - Adobe Embraces Third-Party AI Models
01:18 - My Experience at Adobe's AI Summit
01:45 - Adobe's New Partnerships: Flux, Imagen 3, Runway Frames
02:22 - Where These Models Will Appear First
03:14 - The Interesting Inclusion of Video: Veo 2
04:04 - Potential Cost Savings with Integration
04:26 - Hostinger Horizon AI Web App Builder
08:27 - Stability AI's Stable Virtual Camera
08:28 - How to Use Stable Virtual Camera (Hugging Face Demo)
10:36 - Limitations and Research Preview Status
10:48 - Combining with Gemini 2.0 and Runway Gen 3
11:15 - The Future of Camera and Subject Control
11:48 - A real look at what is coming
Is Google's Gemini 2.0 about to revolutionize image generation and editing? In this video, I'm diving deep into Google's latest AI release and exploring its powerful capabilities. Some are calling it a Photoshop killer, but is it really? We'll break down what Gemini 2.0 CAN do, its limitations, and how you can start using it for FREE right now!
We'll cover:
Image generation quality and prompt coherence
Editing existing images (even Midjourney!)
Creating image sequences for AI video
Generating consistent characters for AI training (LoRA)
Cool community examples & use cases!
Sneak peek at upcoming video generation features!
Whether you're an AI enthusiast, content creator, or just curious about the future of image technology, this video is for you!
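AI Studio is the no-code way in, but the same native image generation is reachable through the Gemini API if you'd rather script it. Here's a minimal sketch using the google-genai Python SDK; the experimental model name changes over time, so treat "gemini-2.0-flash-exp" as an assumption and check the current image-capable model in AI Studio:

from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # free API key from AI Studio

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",  # assumption: confirm the current image-capable model name
    contents="A photo of a man in a blue business suit walking through a neon-lit night market",
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

# Responses mix text and image parts; save any returned image bytes to disk.
for part in response.candidates[0].content.parts:
    if part.inline_data is not None:
        with open("gemini_image.png", "wb") as f:
            f.write(part.inline_data.data)

Editing works the same way: include an existing image in contents alongside your instruction and the model returns a modified version.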
👇 Links & Resources:
Google AI Studio: https://aistudio.google.com/prompts/new_chat
👍 Like, Subscribe, and hit the notification bell for more AI content!
Community Outputs
Bilawal Sidhu: https://x.com/bilawalsidhu
Victor M: https://x.com/victormustar
Min Choi: https://x.com/minchoi
Aiba Keiworks: https://x.com/AibaKeiworks
Umesh: https://x.com/umesh_ai
#Gemini2.0 #AIImageGeneration #AIArt #GoogleAI #Midjourney #Dalle3 #AITools #FreeAI #PhotoshopKiller #ArtificialIntelligence
CHAPTERS
00:00 - Intro
00:31 - Gemini Goes Multimodal
00:45 - This is What OpenAI implied
01:33 - How To Access AI Studio
01:55 - Rate Limits
02:15 - Image Tests With The Man In a Blue Business Suit
03:20 - Generating Video With Luma Labs
03:38 - Cinematic Angles with Midjourney Images
04:31 - Image Fidelity Loss and How To Overcome
05:17 - Things can still be wonky
05:30 - Tips and Tricks With Rerolling
06:19 - Using Real Photos
06:37 - Using This For Video Keyframes
07:14 - Using 3 keyframes in Runway
07:35 - Speedramping
08:02 - The Gamechanger for LoRAs
09:02 - Community Outputs
10:27 - Video is Coming!