
Mistral AI

By: EasyWithAI
11 January 2024 at 14:42
Mistral AI is a large language model and chat assistant tool. You can access the chatbot via the Mistral website by clicking on “Talk to le Chat”, or if you prefer a local setup then you can download and run the model files on your own hardware. The creators of Mistral describe it as an […]

Source

Athina AI

By: EasyWithAI
7 March 2024 at 13:36
Athina is a powerful monitoring and evaluation platform designed for companies deploying large language models (LLMs) in production environments. Its main use case is to allow users to detect hallucinations, analyze their LLM accuracy, and debug outputs through features like prompt management, performance tracking over time, and custom evaluation metrics. Athina integrates seamlessly with popular […]

Source

Stable Beluga 2

By: EasyWithAI
31 July 2023 at 19:31
Stable Beluga 2 is a new open-source LLM developed by Stability AI, based on the LLaMA-2 model by Meta AI with 70 billion parameters. This LLM is currently leading the chart on Hugging Face’s Open LLM Leaderboard. Like most other LLMs, you’ll need an interface installed to run Stable Beluga 2 on […]

Source

Stability AI, Team Behind Stable Diffusion Announces First LLM With ChatGPT-Like Capabilities

By: EasyWithAI
20 April 2023 at 00:11
Stability AI, the team behind the popular AI art tool Stable Diffusion, has announced the launch of its latest creation: StableLM, a suite of text-generating AI models designed to rival systems like OpenAI’s GPT-4 and ChatGPT. Available in “alpha” on GitHub and Hugging Face, StableLM can generate both code and text and has been trained […]

Source

Code Llama

By: EasyWithAI
19 September 2023 at 13:50
Code Llama is a suite of large language models released by Meta AI for generating and enhancing code. It includes foundation models for general coding, Python specializations, and models tailored for following instructions. Key features include state-of-the-art performance, code infilling, large context support up to 100K tokens, and zero-shot ability to follow instructions for programming […]

Source

ChatGLM-6B

By: EasyWithAI
18 September 2023 at 18:02
ChatGLM-6B is an open-source, bilingual conversational AI LLM based on the General Language Model (GLM) framework. It has 6.2 billion parameters and can be deployed locally with only 6GB of GPU memory. This model allows for natural language processing in both Chinese and English, question answering, task-oriented dialogue, and easy integration via API and demo […]

Source

Infermatic

By: EasyWithAI
19 January 2024 at 14:26
Infermatic offers developers and researchers seamless access to leading large language models through a unified platform. Its user-friendly design makes AI experimentation easy for anyone while still providing advanced users with enterprise-scale capabilities. Infermatic’s free version, TotalGPT Free, offers up to 300 requests per day with a 60 token limit. You can check out the […]

Source

Perplexity AI

By: EasyWithAI
4 May 2023 at 01:25
Perplexity AI is an AI chat and search engine that uses advanced technology to provide direct answers to your queries. It delivers accurate answers using large language models and even includes links to citations and related topics. It is available for free via web browser and also on mobile via the Apple App Store. Using […]

Source

NetBSD Bans AI-Generated Code From Commits

By: Maya Posch
18 May 2024 at 08:00

A recent change to the NetBSD commit guidelines states that code generated by Large Language Models (LLMs) or similar technologies, such as ChatGPT, Microsoft’s Copilot, or Meta’s Code Llama, is presumed to be tainted code. The amendment extends the existing section on tainted code, which originally covered any code not written directly by the person committing it, owing to licensing concerns. The obvious reason behind this is that code might otherwise be copied into the NetBSD codebase under an incompatible (or proprietary) license.

In the case of LLM-based code generators like those mentioned above, the problem stems from the fact that they are trained on millions of lines of code from all over the internet, released under a wide variety of licenses. Invariably, some of that code will be covered by a license that is not acceptable for the NetBSD codebase. Although the guideline notes that such auto-generated commits may still be admissible, they require written permission from core developers, and presumably an in-depth audit of the code’s heritage. This should leave non-trivial commits churned out by ChatGPT and its kin out in the cold.

The debate about the validity of works produced by current-gen “artificial intelligence” software is only just beginning, but there’s little question that NetBSD has made the right call here. From a legal and software engineering perspective this policy makes perfect sense, as LLM-generated code simply doesn’t meet the project’s standards. That said, code produced by humans brings with it a whole different set of potential problems.

Train a GPT-2 LLM, Using Only Pure C Code

28 April 2024 at 08:00

[Andrej Karpathy] recently released llm.c, a project that focuses on LLM training in pure C, once again showing that working with these tools isn’t necessarily reliant on sprawling development environments. GPT-2 may be older but is perfectly relevant, being the granddaddy of modern LLMs (large language models) with a clear lineage to more modern offerings.

LLMs are fantastically good at communicating despite not actually knowing what they are saying, and training them usually relies on the PyTorch deep learning library, itself written in Python. llm.c takes a simpler approach by implementing the neural network training algorithm for GPT-2 directly. The result is highly focused and surprisingly short: about a thousand lines of C in a single file. It is a highly elegant process that accomplishes the same thing as bigger, clunkier methods. It can run entirely on a CPU, or it can take advantage of GPU acceleration where available.

This isn’t the first time [Andrej Karpathy] has bent his considerable skills and understanding towards boiling down these sorts of concepts into bare-bones implementations. We previously covered a project of his that is the “hello world” of GPT, a tiny model that predicts the next bit in a given sequence and offers low-level insight into just how GPT (generative pre-trained transformer) models work.
