Andrej Karpathy just dropped one of the most “unhinged” repos he’s ever written: nanochat. It’s a from-scratch, minimal but complete ChatGPT-like system that teaches you how to train and run your own LLM for under $100. In only ~8k lines of clean code it covers the whole stack: a Rust-based tokenizer, pretraining on FineWeb, midtraining on SmolTalk, SFT with evaluation on MMLU / GSM8K / HumanEval, optional RL via GRPO, efficient inference with a KV cache and a prefill/decode split, and a ChatGPT-style web UI. (1/n)
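nanochat trains its tokenizer with its own Rust implementation (details in the repo). As a rough illustration of what that step does, here is a minimal byte-level BPE training sketch using the Hugging Face tokenizers library (also Rust-backed); the vocab size, special tokens, and corpus file are placeholder assumptions, not nanochat's actual configuration.

```python
# Illustrative only: nanochat ships its own Rust tokenizer trainer. This sketch
# uses the Hugging Face `tokenizers` library (also Rust-backed) just to show
# what "train a byte-level BPE tokenizer on a text corpus" looks like.
from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
tokenizer.decoder = decoders.ByteLevel()

trainer = trainers.BpeTrainer(
    vocab_size=65536,  # hypothetical vocab size, not nanochat's actual setting
    special_tokens=["<|bos|>", "<|user|>", "<|assistant|>"],  # hypothetical chat tokens
)

# "corpus.txt" is a placeholder for a slice of pretraining text (e.g. FineWeb shards).
tokenizer.train(files=["corpus.txt"], trainer=trainer)
tokenizer.save("tokenizer.json")

ids = tokenizer.encode("Hello, nanochat!").ids
print(ids)
print(tokenizer.decode(ids))
```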
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI.

It weighs ~8,000 lines of imo quite clean code to:
- Train the tokenizer using a new Rust implementation
- Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics
- Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use
- SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval)
- RL the model optionally on GSM8K with "GRPO"
- Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI
- Write a single markdown report card, summarizing and gamifying the whole thing

Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc.

My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved. Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.
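For readers unfamiliar with the prefill/decode split mentioned above: the prompt is processed in one forward pass that fills a KV cache, after which tokens are generated one at a time against that cache. The sketch below shows the idea with a generic Hugging Face causal LM (gpt2) rather than nanochat's own Engine; the model choice and greedy sampling are assumptions made for brevity.

```python
# A minimal sketch of the prefill/decode split, using a generic Hugging Face
# causal LM (gpt2) instead of nanochat's own Engine. The idea is the same:
# run the whole prompt once to fill the KV cache, then generate one token at a
# time, feeding only the new token plus the cached keys/values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt_ids = tok("Once upon a time", return_tensors="pt").input_ids

with torch.no_grad():
    # Prefill: one forward pass over the full prompt populates the KV cache.
    out = model(prompt_ids, use_cache=True)
    past = out.past_key_values
    next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)  # greedy for simplicity

    generated = [next_id]
    for _ in range(32):
        # Decode: each step feeds a single token; attention reuses the cache,
        # so per-step cost stays roughly constant instead of growing with length.
        out = model(next_id, past_key_values=past, use_cache=True)
        past = out.past_key_values
        next_id = out.logits[:, -1].argmax(dim=-1, keepdim=True)
        generated.append(next_id)

print(tok.decode(torch.cat(generated, dim=1)[0]))
```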
You can spin up a GPU box, run one script, and be talking to your own chatbot in ~4 hours. Karpathy says the $100 model can already write poems and stories, about 12 hours of training surpasses GPT-2 on the CORE metric, and scaling toward $1000 a depth-30 run (roughly the training FLOPs of GPT-3 Small 125M) reaches 40+ on MMLU and 70+ on ARC-Easy. The goal is a unified, readable, hackable repo that bundles the full “strong baseline” pipeline: a successor to nanoGPT and the backbone of his upcoming LLM101n course.
Even with a tiny budget, the results are surprising.
• $100 run (8×H100, ~4 hrs): talks, writes poems and short stories; ~12 hrs of training surpasses GPT-2 on CORE.
• Scaling toward $1000 (~41.6 hrs): a depth-30 model trained for 24 hrs (roughly GPT-3 Small 125M FLOPs, ~1/1000 of GPT-3) scores:
 – MMLU 40+
 – ARC-Easy 70+
 – GSM8K 20+
It’s a genuine research-grade mini-pipeline.
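The “GPT-3 Small FLOPs, ~1/1000 of GPT-3” framing can be sanity-checked with the standard ≈6 · params · tokens estimate for training compute. The sketch below plugs in the published GPT-3 figures (125M and 175B parameters, 300B training tokens); it is a back-of-envelope check, not nanochat's actual accounting.

```python
# Back-of-envelope check of the "GPT-3 Small FLOPs, ~1/1000 of GPT-3" framing,
# using the standard ~6 * params * tokens approximation for training compute.
# Token counts are the published GPT-3 figures, not nanochat's configuration.
def train_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

gpt3_small = train_flops(125e6, 300e9)   # GPT-3 Small: 125M params, 300B tokens
gpt3_full  = train_flops(175e9, 300e9)   # GPT-3: 175B params, 300B tokens

print(f"GPT-3 Small: {gpt3_small:.2e} FLOPs")
print(f"GPT-3:       {gpt3_full:.2e} FLOPs")
print(f"ratio:       1/{gpt3_full / gpt3_small:.0f}")   # ~1/1400, i.e. roughly 1/1000
```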
When someone asked if nanochat could be used to train a personal LLM (on Notion notes, health data, etc.), Karpathy poured cold water on the idea: “This is not a good repo for that… Think of these micro-models as very young children; they lack the raw intelligence of their larger cousins.” If you finetune them on personal data, you might get “cute parroting” that imitates your writing style, but it’ll still be slop. 🪿
Why personalization is hard

To build a genuinely personalized model, you’d need to:
• Prepare high-quality base data
• Generate lots of synthetic data (complex and diverse)
• Finetune a strong open LLM (e.g., via Tinker)
• Possibly mix in large pre-training data so the model retains its general intelligence (see the sketch below)

That’s still research-grade territory today, not a weekend project.
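To make the last bullet concrete, here is a minimal sketch of blending a small personal corpus with a much larger slice of general chat data before finetuning, so the model keeps its general ability instead of overfitting to “cute parroting.” The file names, JSONL format, and 1:9 ratio are illustrative assumptions, not a recipe from the thread.

```python
# A sketch of the data-mixing idea: upsample a tiny personal dataset but keep
# ~90% of the finetuning mixture as general (pretraining/chat-style) data.
# File names and the 1:9 ratio are assumptions for illustration only.
import json
import random

def load_jsonl(path):
    with open(path) as f:
        return [json.loads(line) for line in f]

personal = load_jsonl("personal_notes.jsonl")   # e.g. {"prompt": ..., "response": ...}
general  = load_jsonl("general_chat.jsonl")     # e.g. a SmolTalk-style chat dataset

mix_ratio = 0.1          # fraction of the mixture drawn from personal data
n_total = 50_000
n_personal = int(n_total * mix_ratio)
n_general = n_total - n_personal

mixture = (
    random.choices(personal, k=n_personal) +                  # sample with replacement (upsampling)
    random.sample(general, k=min(n_general, len(general)))    # sample without replacement
)
random.shuffle(mixture)

with open("sft_mixture.jsonl", "w") as f:
    for ex in mixture:
        f.write(json.dumps(ex) + "\n")
```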
Bigger picture

Karpathy sees nanochat as the new nanoGPT: a minimal yet complete framework that can grow into a standard baseline for LLM research, community collaboration, and education. It isn’t fully tuned or optimized yet, but the overall skeleton is solid and ready for GitHub contributors to push it forward, module by module.