
EMO: Alibaba’s Diffusion Model-Based Talking Portrait Generator

By: Maya Posch
10 June 2024 at 23:00

Alibaba’s EMO (or Emote Portrait Alive) framework is a recent entry in a series of attempts to generate a talking head using existing audio (spoken word or vocal audio) and a reference portrait image as inputs. At its core it uses a diffusion model trained on 250 hours of video footage and over 150 million images. But unlike previous attempts, it adds what the researchers call a speed controller and a face region controller. These serve to stabilize the generated frames, along with an additional module that stops the diffusion model from outputting frames too distinct from the reference image used as input.
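The paper itself doesn’t ship code, but the gist of that last module (rejecting frames that drift too far from the reference portrait) is easy to sketch. Below is a toy Python illustration using cosine similarity between face embeddings; the `embed_face` stub and the 0.85 threshold are our own assumptions for demonstration, not anything from EMO’s actual implementation.

```python
import numpy as np

def embed_face(image: np.ndarray) -> np.ndarray:
    """Stand-in for a real face-embedding network (hypothetical).
    Returns a unit-length feature vector for the image."""
    v = image.astype(np.float64).ravel()
    return v / (np.linalg.norm(v) + 1e-9)

def identity_guard(reference: np.ndarray, frame: np.ndarray,
                   threshold: float = 0.85) -> bool:
    """Accept a generated frame only if its embedding stays close
    (cosine similarity >= threshold) to the reference portrait."""
    sim = float(np.dot(embed_face(reference), embed_face(frame)))
    return sim >= threshold

# Toy usage: keep only candidate frames that still resemble the reference.
rng = np.random.default_rng(0)
reference = rng.random((64, 64, 3))
candidates = [reference + rng.normal(0, s, reference.shape)
              for s in (0.01, 0.5)]
kept = [f for f in candidates if identity_guard(reference, f)]
print(f"kept {len(kept)} of {len(candidates)} candidate frames")
```

A real implementation would swap the stub for a proper face-embedding network and fold the check into the sampling loop rather than filtering after the fact.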

In the related paper, [Linrui Tian] and colleagues show a number of comparisons between EMO and other frameworks, claiming significant improvements over them. The researchers also provide a number of examples of talking and singing heads generated using the framework, which gives some idea of what are probably the ‘best case’ outputs. With some examples, like [Leslie Cheung Kwok Wing] singing ‘Unconditional’, big glitches are obvious and there’s a definite mismatch between the vocal track and facial motions. Despite this, it’s quite impressive, especially the fairly realistic movement of the head, including blinking of the eyes.

Meanwhile some seem extremely impressed, such as [Matthew Berman] in a recent video on EMO, where he states that Alibaba releasing this framework to the public might be ‘too dangerous’. The level-headed folks over at PetaPixel, however, also note the obvious visual imperfections that are a dead giveaway for this kind of generative technology. Much like other diffusion model-based generators, it would seem that EMO is still very much stuck in the uncanny valley, with no clear path to becoming a real human yet.

Thanks to [Daniel Starr] for the tip.

A Wireless Monitor Without Breaking The Bank

By: Jenny List
9 June 2024 at 14:00

The quality of available video production equipment has increased hugely as digital video and then high-definition equipment have entered the market. But there are still some components which are expensive, one of which is a decent quality HD wireless monitor. Along comes [FuzzyLogic] with a solution, in the form of an external monitor for a laptop, driven by a wireless HDMI extender.

In one sense this project involves plugging in a series of components and simply using them for their intended purpose. However, it’s more than that: it includes some rather useful 3D printed parts that make for a truly portable wireless monitor, and it saves the rest of us the gamble of buying a wireless HDMI extender without knowing whether it would deliver.

He initially tried an HDMI-to-USB dongle and a streaming Raspberry Pi, but the latency was far too high to be useful. The extender does have a small delay, but not so bad as to be unusable. The whole setup, including the monitor, can be powered from a large USB power bank, answering one of our questions. All the files can be downloaded from Printables should you wish to follow the same path, and meanwhile there’s a video with the details below the break.
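If you want to put a number on that delay for your own capture chain, one rough trick is to flash the screen being captured and time how long the brightness jump takes to show up in the captured frames. Here’s a minimal Python/OpenCV sketch of that idea; the device index and the brightness threshold are assumptions you’d need to tune.

```python
import time

import cv2
import numpy as np

# Rough glass-to-glass latency test: show a black window on the display
# being captured, flash it white, then time how long the brightness jump
# takes to appear in frames coming back through the capture device.
cap = cv2.VideoCapture(0)  # assumed device index for the HDMI-to-USB dongle
if not cap.isOpened():
    raise SystemExit("capture device not found")

black = np.zeros((480, 640, 3), np.uint8)
white = np.full((480, 640, 3), 255, np.uint8)

cv2.imshow("latency-test", black)
cv2.waitKey(500)                 # let the pipeline settle on black

cv2.imshow("latency-test", white)
cv2.waitKey(1)                   # force the redraw...
t_flash = time.monotonic()       # ...and start the clock

for _ in range(300):             # give up after ~300 frames
    ok, frame = cap.read()
    if not ok:
        break
    if frame.mean() > 128:       # brightness threshold: assumed, tune to taste
        dt_ms = (time.monotonic() - t_flash) * 1000
        print(f"approximate latency: {dt_ms:.0f} ms")
        break

cap.release()
cv2.destroyAllWindows()
```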

What If

9 June 2024 at 08:00

We’ve noticed a recent YouTube trend of producing trailers for shows and movies as if they had been made in the 1950s, even when they weren’t. The results are impressive and, as you might expect, leverage AI generation tools. While we enjoy watching them, we were especially interested in [Patrick Gibney’s] peek behind the curtain of how he makes them, as you can see below. If you want to see an example of the result first, check out the second video, showing a 1950s-era The Matrix.

Of course, you could do some of it yourself, but if you want the full AI experience, [Patrick] suggests using ChatGPT to produce a script, though he admits that if he did that, he would tweak the results. Other AI tools create the pictures used and the announcer-style narration. Another tool produces cinematographic shots that include the motion of the “actors” and other things in the scene. More tools create the background music.
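For the script step, driving ChatGPT from code rather than the web interface is only a few lines with the OpenAI Python client. This is a sketch of the general idea, not [Patrick]’s actual workflow; the model name and prompts are our own assumptions.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Model choice and prompt wording are assumptions for illustration.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system",
         "content": "You write movie trailer voice-over scripts in the "
                    "breathless style of 1950s theatrical teasers."},
        {"role": "user",
         "content": "Write a 60-second trailer script for The Matrix, "
                    "as if it were a 1950s science-fiction picture."},
    ],
)
print(response.choices[0].message.content)
```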

Once you have all that, it is straightforward to edit it together as a video. If you want to try your hand, many of the tools have some free tier, although you might not be able to do everything you want in one shot with free tools. [Patrick] reports he spends about $70 a month to get full access to the tools he uses, but he also mentions some other alternatives.

You have to wonder how long it will be before you can just get an AI filmmaker tool that does the whole thing in one swoop. However, doing it in pieces like this does give you a bit more control. In particular, we were interested that some of the “secret sauce” was using negative prompts to prevent certain behaviors in certain tools.
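Negative prompting is a standard knob in open image-generation tooling too. As a hedged illustration of the general idea (not the specific commercial tools used in the video), here is how it looks with Hugging Face’s diffusers library:

```python
import torch
from diffusers import StableDiffusionPipeline

# Load a public text-to-image pipeline; the model ID is one example,
# not one of the tools from the video.
pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")

# The negative prompt tells the sampler what to steer away from,
# the same "secret sauce" idea of suppressing unwanted behaviors.
image = pipe(
    prompt="1950s black-and-white movie still, men in trench coats, film grain",
    negative_prompt="color, modern clothing, text, watermark, extra fingers",
).images[0]
image.save("trailer_frame.png")
```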

We were hoping [Patrick] would send up Star Trek, but for that, we had to check out [Rafa Reels]. Of course, you don’t have to limit yourself to the 1950s. For example, [Patrick] also wondered what it would be like if Star Wars were made in the 1990s with [Sir Sean Connery] as [Obi Wan]. Thanks to him, you don’t have to wonder.

Interfacing a Cheap HDMI Switch With Home Assistant

7 June 2024 at 11:00
[Image: Close-up of the mod installed into the HDMI switch, tapping the IR receiver]

You know the feeling of having just created a perfect setup for your hacker lab? Sometimes, there’s just this missing piece in the puzzle that requires you to do a small hack, and those are the most tempting. [maxime borges] has such a perfect setup that involves an HDMI 4:2 switch, and he brings us a write-up on integrating that HDMI switch into Home Assistant by emulating an infrared receiver’s signals.

[Image: Overview of the HDMI switch with the mod installed]

The HDMI switch is equipped with an infrared sensor as its only means of control, so naturally, that was the path chosen for the ESP32 that [maxime] put inside the switch. Fortunately, Home Assistant provides the means to both receive and output IR signals, so after capturing all the codes produced by the IR remote, parsing their meaning, and turning them into a Home Assistant configuration, [maxime] got HDMI input switching to happen from the comfort of his phone.
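Once the codes live in Home Assistant, switching inputs is just a service call, which also makes it scriptable from anywhere on the network. Here’s a minimal Python sketch against Home Assistant’s REST API; the host, token, and entity ID are placeholders, since the real names depend on how the ESP32 config is set up.

```python
import requests

HA_URL = "http://homeassistant.local:8123"   # placeholder host
TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"       # created in your HA profile

def call_service(domain: str, service: str, data: dict) -> None:
    """Invoke a Home Assistant service over the REST API."""
    resp = requests.post(
        f"{HA_URL}/api/services/{domain}/{service}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=data,
        timeout=5,
    )
    resp.raise_for_status()

# Hypothetical entity exposing the captured IR codes as selectable inputs;
# the real entity name depends on how the config is written.
call_service("select", "select_option",
             {"entity_id": "select.hdmi_switch_input", "option": "HDMI 1"})
```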

We get the Home Assistant config snippets right there in the blog post — if you’ve been looking for an HDMI switch for your hacker lair, now you have one model to look out for in particular. Of course, you could roll your own HDMI switch, and if you’re looking for references, we’ve covered a good few hacks doing that as part of building a KVM.
