AI Cookbook

Meta releases Llama 5 — 405B params open-weight, beats GPT-5.5 on most benchmarks, free for commercial use

Meta dropped Llama 5 this morning. Three sizes — 12B, 80B, and 405B parameters — released under a permissive license that allows commercial use up to 1 billion monthly active users. The 405B variant beats GPT-5.5 on MMLU, GSM8K, and HumanEval, while remaining downloadable from Hugging Face for any developer with the hardware to run it.

Mark Zuckerberg called it the "most important Llama release yet" in a Threads post — and for once, the executive hyperbole is hard to argue with.

The benchmarks that matter

Across standard evaluations, Llama 5 405B scores:

  • **MMLU**: 91.8% (GPT-5.5: 89.4%; Claude Opus 4.7: 90.3%)
  • **GSM8K** (grade school math): 96.2% (GPT-5.5: 95.8%)
  • **HumanEval+** (coding): 88.4% (Claude Sonnet 5: 90.4%; GPT-5.5: 86%)
  • **HellaSwag** (commonsense): 95.8%
  • **GPQA Diamond**: 67.4% (Claude Opus 4.7: 71%; o5: 73.2%)

It does not lead on every benchmark (reasoning still trails OpenAI o5 and Claude Opus), but for a freely downloadable model this is a milestone: it is the first time an open-weight release has matched frontier closed models on most general-purpose tasks.

What you can actually do

Three things:

  • **Download and run locally**: 8x H100 minimum for the 405B at fp8; or 8x A100 for the 80B
  • **Fine-tune freely**: Meta releases LoRA training notebooks, full instruction-tuning data, and DPO datasets
  • **Deploy in production**: license allows up to 1B MAU before Meta requires negotiation; small businesses are essentially unrestricted
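The hardware figures in the first bullet follow from simple weight-memory arithmetic: parameter count times bytes per parameter. A back-of-envelope sketch (the numbers below are lower-bound estimates; KV cache and activation memory add real overhead on top):

```python
# Rough weight-memory estimate for the Llama 5 sizes.
# These are lower bounds: real serving needs extra headroom
# for the KV cache, activations, and framework overhead.

def weight_memory_gb(params_billion: float, bytes_per_param: float) -> float:
    """GB needed just to hold the weights: 1B params * 1 byte = 1 GB."""
    return params_billion * bytes_per_param

# 405B at fp8 (1 byte/param): ~405 GB of weights.
# 8x H100 80GB = 640 GB total, leaving room for KV cache.
print(weight_memory_gb(405, 1.0))  # 405.0

# 80B at bf16 (2 bytes/param): ~160 GB, which is why a
# multi-A100 node suffices for the mid-size model.
print(weight_memory_gb(80, 2.0))   # 160.0
```

This is why the fp8 quantization matters for the 405B: at bf16 the weights alone would be ~810 GB, beyond a single 8-GPU node.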

The open release includes pre-training data signatures (Meta still won't disclose the actual data) and full evaluation suites for reproducibility.
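The LoRA route mentioned above is cheap for a standard reason: a rank-r adapter on a d_out x d_in weight matrix trains r * (d_out + d_in) parameters instead of d_out * d_in. A quick sketch of that arithmetic (the matrix shape below is illustrative, not Llama 5's published config):

```python
# LoRA replaces a full weight update dW (d_out x d_in) with a
# low-rank product B @ A, where B is (d_out x r) and A is (r x d_in).
# Trainable parameters drop from d_out*d_in to r*(d_out + d_in).

def full_trainable(d_in: int, d_out: int) -> int:
    """Parameters updated by full fine-tuning of one matrix."""
    return d_in * d_out

def lora_trainable(d_in: int, d_out: int, r: int) -> int:
    """Parameters updated by a rank-r LoRA adapter on the same matrix."""
    return r * (d_in + d_out)

# Hypothetical 8192x8192 attention projection with rank-16 adapters:
full = full_trainable(8192, 8192)      # 67,108,864
lora = lora_trainable(8192, 8192, 16)  # 262,144
print(f"LoRA trains {lora / full:.2%} of the full matrix")  # 0.39%
```

At well under 1% of the parameters per adapted matrix, fine-tuning even the 80B becomes feasible on far less hardware than full training would need.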

The open-source vs closed-source narrative

For two years, the conventional wisdom has been that open weights run 12-18 months behind the closed labs. Llama 5 collapses that timeline: the 405B is essentially at the frontier on general tasks the day it is released.

What's still meaningfully ahead in closed labs:

  • Reasoning models (OpenAI o5, Claude with extended thinking) maintain a 5-8 point lead on PhD-level science questions
  • Multimodal video processing (Gemini 3 Ultra) remains unmatched in the open ecosystem
  • Tool-use and agent benchmarks favor closed APIs that can iterate quickly

For builders: Llama 5 80B is now the obvious default for most production workloads. Self-hosting costs roughly a tenth of Claude Sonnet API pricing at scale, and quality is within the margin of error for most use cases.
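The "roughly 1/10th" figure depends heavily on utilization, since GPU rental is a fixed cost while API billing scales with tokens. A minimal break-even sketch, with every price below a placeholder assumption rather than a quoted rate:

```python
# Self-hosting vs. API cost comparison. All prices here are
# placeholder assumptions for illustration, not quoted rates.

def api_cost_per_month(tokens_per_month: float, usd_per_mtok: float) -> float:
    """API spend: token volume times blended $/million tokens."""
    return tokens_per_month / 1e6 * usd_per_mtok

def selfhost_cost_per_month(gpus: int, usd_per_gpu_hour: float) -> float:
    """Self-hosting spend: GPU rental running 24/7 for a 30-day month."""
    return gpus * usd_per_gpu_hour * 24 * 30

# Assume 20B tokens/month at a blended $6/Mtok, vs. an 8-GPU
# node rented at $2/GPU-hour (both hypothetical figures):
api = api_cost_per_month(20e9, 6.0)       # $120,000
hosted = selfhost_cost_per_month(8, 2.0)  # $11,520
print(f"API ${api:,.0f}/mo vs. self-host ${hosted:,.0f}/mo")
```

Under these assumed numbers the gap is close to 10x at high volume, but the fixed-cost structure cuts the other way at low volume: a team pushing a tenth of the tokens pays the same rental bill and the API becomes the cheaper option.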

Sources

  • Meta AI Blog (April 28, 2026): Introducing Llama 5
  • Hugging Face Llama 5 release page (April 28, 2026)
  • VentureBeat (April 28, 2026): Llama 5 405B beats GPT-5.5 on MMLU at zero API cost