LLaMA 3.1

Meta · July 2024

Active · Open Weight · Decoder-only · Text · API Available
Parameters: 8B / 70B / 405B
Context Window: 128K tokens
Variants: 8B, 70B, 405B

Why It Matters

The 405B model was the largest openly available language model at launch, and it showed that open-weight models could compete head-to-head with leading closed systems such as GPT-4o.

Description

This release introduced the 405B-parameter flagship, the largest openly available model at the time, alongside updated 8B and 70B versions. It extended the context window to 128K tokens (roughly 100,000 words), enough to process entire books or large codebases in a single prompt. It was the first open model to rival GPT-4o in overall capability.
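The word estimate follows from a rule-of-thumb ratio of words per token. Below is a minimal sketch of that arithmetic, assuming roughly 0.75 English words per token (a common heuristic, not a figure from Meta; the real ratio depends on the tokenizer and the text) and reading 128K as 128 × 1024 tokens. The function names are hypothetical helpers for illustration only.

```python
# Assumed heuristic: about 0.75 English words per token. Varies by tokenizer and language.
WORDS_PER_TOKEN = 0.75
# Assumption: "128K" is read as 128 * 1024 tokens.
CONTEXT_TOKENS = 128 * 1024


def estimate_words(num_tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Rough number of English words that fit in num_tokens tokens."""
    return round(num_tokens * words_per_token)


def fits_in_context(num_words: int,
                    context_tokens: int = CONTEXT_TOKENS,
                    words_per_token: float = WORDS_PER_TOKEN) -> bool:
    """True if a text of num_words words is estimated to fit in the window."""
    estimated_tokens = num_words / words_per_token
    return estimated_tokens <= context_tokens


print(estimate_words(CONTEXT_TOKENS))  # 98304
```

Under these assumptions a 90,000-word novel fits in one prompt, while a 120,000-word manuscript would need to be split or summarized first.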

Notable Milestones

  • Largest open-weight model at time of release (405B)
  • First open model to rival GPT-4o
  • Adopted as distillation teacher for smaller open models

Benchmark Scores

MMLU (Massive Multitask Language Understanding, 57 subjects): 88.6%
HumanEval (code generation pass@1, Python problems): 85.3%
MATH (competition-level problems): 73.8%
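For readers unfamiliar with the pass@1 metric, here is a minimal sketch of the unbiased pass@k estimator commonly used with HumanEval. The page does not say how its score was computed, so this only illustrates the metric itself. The function names and the `(n, c)` input format are assumptions made for the example.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples generated, c of them correct."""
    if n - c < k:
        # Every size-k subset of the samples contains at least one correct one.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement are all wrong.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k: int) -> float:
    """Mean pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to c / n, the fraction of generated samples that pass the unit tests, averaged over all problems.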

Key Innovations

Open Weight: Model weights are publicly released but training data/code may not be. Enables fine-tuning but not full reproduction.
Long Context: Ability to process very long inputs (100K+ tokens), enabling analysis of entire codebases or books.

Related Research (2)

RoPE · Architecture
2021 · Zhuiyi Technology

Introduced rotary position embeddings that encode position via rotation matrices, enabling better length generalization. Used by virtually every moder…
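To make the rotation idea concrete, here is a minimal pure-Python sketch of rotary position embeddings. It assumes the adjacent-pair layout and the base of 10000 from the original formulation. Production implementations often use a different dimension layout and base, and the names `rope` and `dot` are illustrative only.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each consecutive pair (x0, x1), (x2, x3), ... by a position-dependent angle."""
    d = len(vec)
    assert d % 2 == 0, "dimension must be even"
    out = []
    for i in range(0, d, 2):
        # Pair i/2 uses frequency base^(-i/d); lower pairs rotate faster.
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.25]
# Both pairs of positions are 2 apart, so the two scores agree up to rounding.
score_near = dot(rope(q, 5), rope(k, 3))
score_far = dot(rope(q, 12), rope(k, 10))
```

Because each pair is rotated by an angle proportional to its position, the score between a query and a key depends only on how far apart they are, not on their absolute positions.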

GQA
2023 · Google Research

Introduced grouped-query attention as a middle ground between multi-head and multi-query attention, reducing KV cache memory while maintaining quality…
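The KV cache saving can be estimated with simple arithmetic. The sketch below assumes an illustrative configuration that is not taken from this page: 32 layers, 32 query heads, head dimension 128, 16-bit cached values and a 128 × 1024-token sequence. It compares multi-head, grouped-query (8 KV heads) and multi-query attention.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor 2: one tensor for keys and one for values in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Assumed, illustrative configuration (not stated in the source).
LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 128, 128 * 1024
GIB = 1024 ** 3

mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)  # one KV head per query head
gqa = kv_cache_bytes(LAYERS, 8, HEAD_DIM, SEQ_LEN)            # 8 KV heads shared by groups of 4 query heads
mqa = kv_cache_bytes(LAYERS, 1, HEAD_DIM, SEQ_LEN)            # a single KV head shared by all query heads

print(mha / GIB, gqa / GIB, mqa / GIB)  # 64.0 16.0 2.0
```

The cache scales linearly with the number of KV heads, so under these assumptions sharing each KV head across four query heads cuts the cache to a quarter of the multi-head size.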