Meta dropped Llama 5 this morning. Three sizes — 12B, 80B, and 405B parameters — released under a permissive license that allows commercial use up to 1 billion monthly active users. The 405B variant beats GPT-5.5 on MMLU, GSM8K, and HumanEval, while remaining downloadable from Hugging Face for any developer with the hardware to run it.
Mark Zuckerberg called it the "most important Llama release yet" in a Threads post — and for once, the executive hyperbole is hard to argue with.
The benchmarks that matter
Across standard evaluations, Llama 5 405B scores:
- **MMLU**: 91.8% (GPT-5.5: 89.4%; Claude Opus 4.7: 90.3%)
- **GSM8K** (grade school math): 96.2% (GPT-5.5: 95.8%)
- **HumanEval+** (coding): 88.4% (Claude Sonnet 5: 90.4%; GPT-5.5: 86.0%)
- **HellaSwag** (commonsense): 95.8%
- **GPQA Diamond**: 67.4% (Claude Opus 4.7: 71%; o5: 73.2%)
It does not lead on every benchmark — reasoning still trails OpenAI o5 and Claude Opus — but for a freely downloadable model, this is the first time an open-weights release has matched frontier closed models on most general-purpose tasks.
What you can actually do
Three things:
- **Download and run locally**: 8x H100 minimum for the 405B at fp8; or 8x A100 for the 80B
- **Fine-tune freely**: Meta releases LoRA training notebooks, full instruction-tuning data, and DPO datasets
- **Deploy in production**: license allows up to 1B MAU before Meta requires negotiation; small businesses are essentially unrestricted
The open release includes pre-training data signatures (Meta still won't disclose the actual data) and full evaluation suites for reproducibility.
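Those hardware figures are easy to sanity-check: weight memory alone is just parameter count times bytes per parameter. A minimal sketch (the helper below is hypothetical, and it ignores activation and KV-cache overhead, which is why the listed GPU counts leave headroom):

```python
def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """Approximate memory (GB) needed just to hold the model weights."""
    return params_billion * 1e9 * bytes_per_param / 1e9

# 405B at fp8 (1 byte/param): ~405 GB of weights,
# which fits across 8x H100 (640 GB total at 80 GB each)
# with room left over for KV cache and activations.
print(weight_memory_gb(405, 1.0))  # 405.0

# 80B at fp16 (2 bytes/param): ~160 GB, within reach of 8x A100.
print(weight_memory_gb(80, 2.0))   # 160.0
```

The same arithmetic explains why the 405B is out of reach below 8 high-memory GPUs: even at fp8 the weights alone exceed any single accelerator.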
The open-source vs closed-source narrative
For two years, the conventional wisdom has been "open weights are 12-18 months behind closed labs". Llama 5 collapses that narrative. The 405B is essentially at the frontier on general tasks the day it ships.
What's still meaningfully ahead in closed labs:
- Reasoning models (OpenAI o5, Claude with extended thinking) maintain a 5-8 point lead on PhD-level science questions
- Multimodal video processing (Gemini 3 Ultra) remains unmatched in the open ecosystem
- Tool-use and agent benchmarks favor closed APIs that can iterate quickly
For builders: Llama 5 80B is now the obvious default for most production workloads. Self-hosting cost is roughly 1/10th of the Claude Sonnet API at scale, and quality is within the margin of error for most use cases.
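The intuition behind that cost gap: API spend scales linearly with token volume, while self-hosting is a roughly flat GPU-rental bill, so the ratio widens with traffic. A back-of-envelope sketch — all prices below are placeholder assumptions for illustration, not actual 2026 rates; only the order-of-magnitude gap comes from the comparison above:

```python
def monthly_cost_api(tokens_per_month: float, usd_per_mtok: float) -> float:
    """API cost at a flat per-million-token price (price is a placeholder)."""
    return tokens_per_month / 1e6 * usd_per_mtok

def monthly_cost_selfhost(gpu_hourly_usd: float, num_gpus: int,
                          hours: float = 730) -> float:
    """Self-hosted cost: GPU rental for a full month, independent of traffic."""
    return gpu_hourly_usd * num_gpus * hours

# Hypothetical numbers: 10B tokens/month at $10/Mtok via API,
# versus an 8-GPU node rented at $2/GPU-hour.
api = monthly_cost_api(1e10, 10.0)        # $100,000/month
selfhost = monthly_cost_selfhost(2.0, 8)  # $11,680/month
print(api / selfhost)                     # ~8.6x — the order of magnitude claimed
```

At lower volumes the math flips: below the break-even point, the flat GPU bill exceeds pay-per-token API spend, which is why the advantage is qualified with "at scale".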
Sources
- Meta AI Blog (April 28, 2026): Introducing Llama 5
- Hugging Face Llama 5 release page (April 28, 2026)
- VentureBeat (April 28, 2026): Llama 5 405B beats GPT-5.5 on MMLU at zero API cost