Vista de Lectura

Hay nuevos artículos disponibles. Pincha para refrescar la página.

Mistral AI

Mistral AI is a large language model and chat assistant tool. You can access the chatbot via the Mitral website by clicking on “Talk to le Chat“, or if you prefer a local setup then you can download and run the model files on your own hardware. The creators of Mistral describe it as an […]

Source

Athina AI

Athina is a powerful monitoring and evaluation platform designed for companies deploying large language models (LLMs) in production environments. Its main use case is to allow users to detect hallucinations, analyze their LLM accuracy, and debug outputs through features like prompt management, performance tracking over time, and custom evaluation metrics. Athina integrates seamlessly with popular […]

Source

Stable Beluga 2

Stable Beluga 2 is a new open-source LLM developed by Stability AI and is based off of the LLamA-2 model by Meta AI with 70 billion parameters. This LLM is currently leading the chart on Hugging Face’s Open LLM Leaderboard. Like most other LLMs, you’ll need an interface installed to run Stable Beluga 2 on […]

Source

Code Llama

Code Llama is a suite of large language models released by Meta AI for generating and enhancing code. It includes foundation models for general coding, Python specializations, and models tailored for following instructions. Key features include state-of-the-art performance, code infilling, large context support up to 100K tokens, and zero-shot ability to follow instructions for programming […]

Source

ChatGLM-6B

ChatGLM-6B is an open-source, bilingual conversational AI LLM based on the General Language Model (GLM) framework. It has 6.2 billion parameters and can be deployed locally with only 6GB of GPU memory. This model allows for natural language processing in both Chinese and English, question answering, task-oriented dialogue, and easy integration via API and demo […]

Source

Infermatic

Infermatic offers developers and researchers seamless access to leading large language models through a unified platform. Its user-friendly design makes AI experimentation easy for anyone while still providing advanced users with enterprise-scale capabilities. Infermatic’s free version, TotalGPT Free, offers up to 300 requests per day with a 60 token limit. You can check out the […]

Source

Perplexity AI

Perplexity AI is an AI chat and search engine that uses advanced technology to provide direct answers to your queries. It delivers accurate answers using large language models and even includes links to citations and related topics. It is available for free via web browser and also on mobile via the Apple App Store. Using […]

Source

NetBSD Bans AI-Generated Code From Commits

A recent change was announced to the NetBSD commit guidelines which amends these to state that code which was generated by Large Language Models (LLMs) or similar technologies, such as ChatGPT, Microsoft’s Copilot or Meta’s Code Llama is presumed to be tainted code. This amendment was to the existing section about tainted code, which originally referred to any code that was not written directly by the person committing the code, and was due to licensing concerns. The obvious reason behind this is that otherwise code may be copied into the NetBSD codebase which may have been licensed under an incompatible (or proprietary) license.

In the case of LLM-based code generators like the above-mentioned, the problem stems from the fact that they are trained on millions of lines of code from all over the internet, which are naturally released under a wide variety of licenses. Invariably, some of that code will be covered by a license that’s not acceptable for the NetBSD codebase. Although the guideline mentions that these auto-generated code commits may still be admissible, they require written permission from core developers, and presumably an in-depth audit of the code’s heritage. This should leave non-trivial commits that got churned out by ChatGPT and kin out in the cold.

The debate about the validity of works produced by current-gen “artificial intelligence” software is only just beginning, but there’s little question that NetBSD has made the right call here. From a legal and software engineering perspective this policy makes perfect sense, as LLM-generated code simply doesn’t meet the project’s standards. That said, code produced by humans brings with it a whole different set of potential problems.

Train a GPT-2 LLM, Using Only Pure C Code

[Andrej Karpathy] recently released llm.c, a project that focuses on LLM training in pure C, once again showing that working with these tools isn’t necessarily reliant on sprawling development environments. GPT-2 may be older but is perfectly relevant, being the granddaddy of modern LLMs (large language models) with a clear heritage to more modern offerings.

LLMs are fantastically good at communicating despite not actually knowing what they are saying, and training them usually relies on PyTorch deep learning library, itself written in Python. llm.c takes a simpler approach by implementing the neural network training algorithm for GPT-2 directly. The result is highly focused and surprisingly short: about a thousand lines of C in a single file. It is a highly elegant process that does the same thing the bigger, clunkier methods accomplish. It can run entirely on a CPU, or it can take advantage of GPU acceleration, where available.

This isn’t the first time [Andrej Karpathy] has bent his considerable skills and understanding towards boiling down these sorts of concepts into bare-bones implementations. We previously covered a project of his that is the “hello world” of GPT, a tiny model that predicts the next bit in a given sequence and offers low-level insight into just how GPT (generative pre-trained transformer) models work.

❌