An Animated Walkthrough of How Large Language Models Work

20 Noviembre 2024 at 12:00

If you wonder how Large Language Models (LLMs) work and aren’t afraid of getting a bit technical, don’t miss [Brendan Bycroft]’s LLM Visualization. It is an interactively-animated step-by-step walk-through of a GPT large language model complete with animated and interactive 3D block diagram of everything going on under the hood. Check it out!

nano-gpt has only around 85,000 parameters, but the operating principles are all the same as for larger models.

The demonstration walks through a simple task and shows every step. The task is this: using the nano-gpt model, take a sequence of six letters and put them into alphabetical order.

A GPT model is a highly complex prediction engine, so the whole process begins with tokenizing the input (breaking up words and assigning numerical values to the chunks) and ends with choosing an appropriate output from a list of probabilities. There are of course many more steps in between, and different ways to adjust the model’s behavior. All of these are made quite clear by [Brendan]’s process breakdown.

We’ve previously covered how LLMs work, explained without math which eschews gritty technical details in favor of focusing on functionality, but it’s also nice to see an approach like this one, which embraces the technical elements of exactly what is going on.

We’ve also seen a much higher-level peek at how a modern AI model like Anthropic’s Claude works when it processes requests, extracting human-understandable concepts that illustrate what’s going on under the hood.

Schooling ChatGPT on Antenna Theory Misconceptions

Hackaday

Dan Maloney

17 Noviembre 2024 at 18:00

We’re not very far into the AI revolution at this point, but we’re far enough to know not to trust AI implicitly. If you accept what ChatGPT or any of the other AI chatbots have to say at face value, you might just embarrass yourself. Or worse, you might make a mistake designing your next antenna.

We’ll explain. [Gregg Messenger (VE6WO)] asked a seemingly simple question about antenna theory: Does an impedance mismatch between the antenna and a coaxial feedline result in common-mode current on the coax shield? It’s an important practical matter, as any ham who has had the painful experience of “RF in the shack” can tell you. They also will likely tell you that common-mode current on the shield is caused by an unbalanced antenna system, not an impedance mismatch. But when [Gregg] asked Google Gemini and ChatGPT that question, the answer came back that impedance mismatch can cause current flow on the shield. So who’s right?

In the first video below, [Gregg] built a simulated ham shack using a 100-MHz signal generator and a length of coaxial feedline. Using a toroidal ferrite core with a couple of turns of magnet wire and a capacitor as a current probe for his oscilloscope, he was unable to find a trace of the signal on the shield even if the feedline was unterminated, which produces the impedance mismatch that the chatbots thought would spell doom. To bring the point home, [Gregg] created another test setup in the second video, this time using a pair of telescoping whip antennas to stand in for a dipole antenna. With the coax connected directly to the dipole, which creates an unbalanced system, he measured a current on the feedline, which got worse when he further unbalanced the system by removing one of the legs. Adding a balun between the feedline and the antenna, which shifts the phase on each leg of the antenna 180° apart, cured the problem.

We found these demonstrations quite useful. It’s always good to see someone taking a chatbot to task over myths and common misperceptions. We look into baluns now and again. Or even ununs.

Vista de Lectura