An Animated Walkthrough of How Large Language Models Work

If you wonder how Large Language Models (LLMs) work and aren’t afraid of getting a bit technical, don’t miss [Brendan Bycroft]’s LLM Visualization. It is an interactive, animated, step-by-step walk-through of a GPT large language model, complete with a 3D block diagram of everything going on under the hood. Check it out!

nano-gpt has only around 85,000 parameters, but the operating principles are all the same as for larger models.

The demonstration walks through a simple task and shows every step. The task is this: using the nano-gpt model, take a sequence of six letters and put them into alphabetical order.

A GPT model is a highly complex prediction engine, so the whole process begins with tokenizing the input (breaking up words and assigning numerical values to the chunks) and ends with choosing an appropriate output from a list of probabilities. There are of course many more steps in between, and different ways to adjust the model’s behavior. All of these are made quite clear by [Brendan]’s process breakdown.
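The two ends of that pipeline are simple enough to sketch. The following is a minimal illustration, not code from [Brendan]’s visualization: the three-letter vocabulary, the sample input, and the logit values are all made up, and the model itself (everything between tokenizing and picking an output) is left out entirely.

```python
import math

# Hypothetical vocabulary: one token per letter (an assumption for this sketch).
VOCAB = ["A", "B", "C"]
TOKEN_ID = {ch: i for i, ch in enumerate(VOCAB)}

def tokenize(text):
    """Map each character to its integer token id."""
    return [TOKEN_ID[ch] for ch in text]

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def pick_token(probs, rng=None):
    """Greedy pick when rng is None, otherwise sample from the distribution."""
    if rng is None:
        return max(range(len(probs)), key=probs.__getitem__)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

tokens = tokenize("CBABBC")
# Made-up logits standing in for the model's output at one position.
probs = softmax([2.0, 0.5, -1.0])
next_token = VOCAB[pick_token(probs)]
```

Passing a `random.Random` instance as `rng` switches from always taking the most likely token to sampling, and `temperature` is one of the knobs that adjusts how adventurous that sampling is.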

We’ve previously covered an explanation of how LLMs work without math, which eschews gritty technical details in favor of focusing on functionality, but it’s also nice to see an approach like this one, which embraces the technical elements of exactly what is going on.

We’ve also seen a much higher-level peek at how a modern AI model like Anthropic’s Claude works when it processes requests, extracting human-understandable concepts that illustrate what’s going on under the hood.

Power Supply With Benchtop Features Fits In Your Pocket

[CentyLab]’s PocketPD isn’t just adorably tiny — it also boasts some pretty useful features. It offers a lightweight way to get a precisely adjustable output of 0 to 20 V at up to 5 A with banana jack output, integrating a rotary encoder and OLED display for ease of use.

PocketPD leverages USB-C Power Delivery (PD), a technology with capabilities our own [Arya Voronova] has summarized nicely. In particular, PocketPD makes use of the Programmable Power Supply (PPS) functionality to precisely set and control voltage and current. Doing this does require a compatible USB-C charger or power bank, but that’s not too big of an ask these days.

Even if an attached charger doesn’t support PPS, PocketPD can still be useful. The device interrogates the attached charger on every bootup, and displays available options. By default PocketPD selects the first available 5 V output mode with chargers that don’t support PPS.
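That fallback behavior can be sketched in a few lines. This is only an illustration of the selection rule described above, written in Python rather than taken from the PocketPD firmware; the `PowerProfile` type, its field names, and the example charger are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class PowerProfile:
    """One capability advertised by a charger (field names are hypothetical)."""
    kind: str    # "fixed" or "pps"
    min_mv: int  # lowest voltage in millivolts
    max_mv: int  # highest voltage in millivolts (equals min_mv when fixed)
    max_ma: int  # current limit in milliamps

def supports_pps(profiles: Sequence[PowerProfile]) -> bool:
    """True if the charger advertises at least one PPS profile."""
    return any(p.kind == "pps" for p in profiles)

def first_fixed_5v(profiles: Sequence[PowerProfile]) -> Optional[PowerProfile]:
    """Return the first fixed 5 V profile, or None if there is none."""
    for p in profiles:
        if p.kind == "fixed" and p.min_mv == p.max_mv == 5000:
            return p
    return None

# A made-up charger that does not support PPS.
charger = [
    PowerProfile("fixed", 5000, 5000, 3000),
    PowerProfile("fixed", 9000, 9000, 3000),
    PowerProfile("fixed", 20000, 20000, 2250),
]

# Without PPS, fall back to the first 5 V mode; what happens with PPS
# is left open here because the source does not describe a default.
default = None if supports_pps(charger) else first_fixed_5v(charger)
```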

The latest hardware version is still in development and the GitHub repository has all the firmware, which is aimed at making it easy to modify or customize. Interested in some hardware? There’s a pre-launch crowdfunding campaign you can watch.

AI Face Anonymizer Masks Human Identity in Images

We’re all pretty familiar with AI’s ability to create realistic-looking images of people that don’t exist, but here’s an unusual use of that technology: masking people’s identity without altering the substance of the image itself. The result is that the content and “purpose” (for lack of a better term) of the image remain unchanged, while the actual person in it becomes impossible to identify. This invites some interesting privacy-related applications.

Originals on left, anonymized versions on the right. The substance of the images has not changed.

The paper for Face Anonymization Made Simple has all the details, but the method boils down to using diffusion models to take an input image, automatically pick out identity-related features, and alter them in a way that looks more or less natural. For this purpose, identity-related features essentially means key parts of a human face. Other elements of the photo (background, expression, pose, clothing) are left unchanged. As a concept it’s been explored before, but researchers show that this versatile method is both simpler and better-performing than others.

Diffusion models are the essence of AI image generators like Stable Diffusion. The fact that they can be run locally on personal hardware has opened the doors to all kinds of interesting experimentation, like this haunted mirror and other interactive experiments. Forget tweaking dull sliders like “brightness” and “contrast” for an image. How about altering the level of “moss”, “fire”, or “cookie” instead?

The Constant Monitoring and Work That Goes into JWST’s Optics

The James Webb Space Telescope’s array of eighteen hexagonal mirrors went through an intricate (and lengthy) alignment and calibration process before it could begin its mission — but the process is far from being a one-and-done. Keeping the telescope aligned and performing optimally requires constant work from its own team dedicated to the purpose.

Alignment of the optical elements in JWST is so fine, and the tool is so sensitive, that even small temperature variations have an effect on results. For about twenty minutes every other day, the monitoring program uses a set of lenses that intentionally de-focus images of stars by a known amount. These distortions contain measurable features that the team uses to build a profile of changes over time. Each of the mirror segments is also checked by being imaged selfie-style every three months.

This work and maintenance plan pays off. The team has made over 25 corrections since the mission began, and JWST’s optics continue to exceed specifications. The increased performance has direct payoffs in that better data can be gathered from faint celestial objects.

JWST was fantastically ambitious and is extremely successful, and as a science instrument it is jam-packed with amazing bits, not least of which are the actuators responsible for adjusting the mirrors.

Here’s Code for that AI-Generated Minecraft Clone

A little while ago Oasis was showcased on social media, billing itself as the world’s first playable “AI video game” that responds to complex user input in real-time. Code is available on GitHub for a down-scaled local version if you’d like to take a look. There’s a bit more detail and background in the accompanying project write-up, which talks about both the potential as well as the numerous limitations.

We suspect the focus on supporting complex user input (such as mouse look and an item inventory) is what the creators feel distinguishes it meaningfully from AI-generated DOOM. The latter was a concept that demonstrated AI image generators could (kinda) function as real-time game engines.

Image generators are, in a sense, prediction machines. The idea is that by providing a trained model with a short history of what just happened plus the user’s input as context, it can generate a pretty usable prediction of what should happen next, and do it quickly enough to be interactive. Run that in a loop, and you get some pretty impressive clips to put on social media.
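The loop itself is easy to sketch, even if the model inside it is not. Below is a minimal illustration under heavy assumptions: `toy_model` is a stand-in that treats a "frame" as a single integer position, and `run_loop` and `history_len` are names invented here, not anything from the Oasis code.

```python
from collections import deque

def run_loop(predict_next, first_frame, inputs, history_len=4):
    """Feed recent frames plus each user input to the model, one step at a time."""
    history = deque([first_frame], maxlen=history_len)  # short rolling context
    frames = []
    for user_input in inputs:
        frame = predict_next(list(history), user_input)
        history.append(frame)  # the prediction becomes part of the next context
        frames.append(frame)
    return frames

def toy_model(history, user_input):
    """Stand-in for a trained model: move the last position by the input."""
    return history[-1] + user_input

frames = run_loop(toy_model, 0, [1, 1, 0, -1])
# frames == [1, 2, 2, 1]
```

Because each prediction is fed back in as context, any error the model makes is carried forward into later steps, which is one plausible reason such systems drift when the recent history carries little information.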

It is a neat idea, and we certainly applaud the creativity of bending an image generator to this kind of application, but we can’t help but really notice the limitations. Sit and stare at something, or walk through dark or repetitive areas, and the system loses its grip and things rapidly go in a downward spiral we can only describe as “dreamily broken”.

It may be more a demonstration of a concept than a properly functioning game, but it’s still a very clever way to leverage image generation technology. Although, if you’d prefer AI to keep the game itself untouched, take a look at neural networks trained to use the DOOM level creator tools.

Nix + Automated Fuzz Testing Finds Bug in PDF Parser

[Michael Lynch]’s adventures in configuring Nix to automate fuzz testing are a lot of things all rolled into one. It’s not only a primer on fuzz testing (a method of finding bugs) but it’s also a how-to on automating the setup using Nix (which is a lot of things, including a kind of package manager) as well as useful info on effectively automating software processes.

[Michael] not only walks through how he got it all up and running in a simplified and usefully-portable way, but he actually found a buffer overflow in pdftotext in the process! (Turns out someone else had reported the same bug a few weeks before he found it, but it demonstrates everything regardless.)

[Michael] chose fuzz testing because, while using it to find security vulnerabilities is conceptually simple, actually doing it tends to require setting up a test environment with a complex workflow and a lot of dependencies. The result has a high degree of task specificity, and isn’t very portable or reusable. Nix allowed him to really simplify the process while also making it more adaptable. Be sure to check out part two, which goes into detail about how exactly one goes from discovering an input that crashes a program to tracking down (and patching) the reason it happened.
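To see why the idea is conceptually simple, here is a minimal sketch of mutation-based fuzzing. It is not [Michael]’s setup and has nothing to do with pdftotext: `toy_parser` is a hypothetical target with a deliberately planted bug, and real fuzzers are far more sophisticated about choosing inputs.

```python
import random

def toy_parser(data: bytes) -> int:
    """Hypothetical target with a planted bug: it trusts a length byte."""
    if len(data) < 2 or data[0] != 0x25:  # expects a leading '%'
        return 0
    length = data[1]
    payload = data[2:]
    # Bug: reads payload[length - 1] without checking len(payload).
    return payload[length - 1] if length else 0

def mutate(seed: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte of the seed with a random value."""
    buf = bytearray(seed)
    buf[rng.randrange(len(buf))] = rng.randrange(256)
    return bytes(buf)

def fuzz(target, seed: bytes, iterations: int, rng: random.Random):
    """Return the first mutated input that makes target raise, else None."""
    for _ in range(iterations):
        candidate = mutate(seed, rng)
        try:
            target(candidate)
        except Exception:
            return candidate
    return None

# The seed is a valid input: '%', a length of 3, then three payload bytes.
crasher = fuzz(toy_parser, b"%\x03abc", 10_000, random.Random(0))
```

In Python the bad read surfaces as an `IndexError`; in a C program the equivalent mistake would be an out-of-bounds memory access, which is the class of bug fuzzing is known for finding.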

Making fuzz testing easier (and in a sense, cheaper) is something people have been interested in for a long time, even going so far as to see whether pressing a stack of single-board computers into service as dedicated fuzz testers made economic sense.

Split-Flap Clock Flutters Its Way to Displaying Time Without Numbers

Here’s a design for a split-flap clock that doesn’t do it the usual way. Instead of the flaps showing numbers, Klapklok has a bit more in common with flip-dot displays.

Klapklok updates every 2.5 minutes.

It’s an art piece that uses custom-made split-flaps which flutter away to update the display as time passes. An array of vertically-mounted flaps creates a sort of low-res display, emulating an analog clock. These are no ordinary actuators, either. The visual contrast and cleanliness of the mechanism are fantastic, and the sound they make is less of a chatter and more of a whisper.

The sound the flaps create and the sight of the high-contrast flaps in motion are intended to be a relaxing and calming way to connect with the concept of time passing. There’s some interactivity built in as well, as the Klapklok also allows one to simply draw on it wirelessly via a mobile phone.

Klapklok has a total of 69 elements which are all handmade. We imagine there was really no other way to get exactly what the designer had in mind; something many of us can relate to.

Split-flap mechanisms are wonderful for a number of reasons, and if you’re considering making your own, be sure to check out this easy and modular DIY reference design before you go about re-inventing the wheel. On the other hand, if you do wish to get clever about actuators, maybe check out this flexible PCB that is also its own actuator.

DIY Laser Tag Project Does it in Style

This DIY laser tag project designed by [Nii], which he brought to Tokyo Maker Faire back in September, is a treasure trove. It’s all in Japanese and you’ll need to visit X (formerly Twitter) to see it, but the images do a fine job of getting the essentials across and your favorite translator tool will do a fair job of the rest.

There’s a whole lot to admire in this project. The swing-out transparent OLED display is super slick, the electronics are housed on a single PCB, the back half of the grip is in fact a portable USB power bank that slots directly in to provide power, and there’s a really smart use of a short RGB LED strip for effects.

The optical elements show some inspired design, as well. An infrared LED points forward, and with the help of a lens, focuses the beam tightly enough to make aiming meaningful. For detecting hits, the top of the pistol conceals a custom-made reflector that directs any IR downward into a receiver, making it omnidirectional in terms of hit sensing but only needing a single sensor.

Want to know more? Check out [Nii]’s earlier prototypes on his website. It’s clear this has been in the works for a while, so if you like seeing how a project develops, you’re in for a treat.

As for the choice of transparent OLED displays? They are certainly cool, and we remember how wild it looks to have several stacked together.
