LLaMA 3.1

Meta · July 2024

● active · Open Weight · decoder only · text · API Available

Parameters: 8B / 70B / 405B

Context Window: 128K tokens

Variants: 8B, 70B, 405B

Why It Matters

The 405B model was the largest openly available language model at launch, proving that open-weight models could compete head-to-head with the best closed systems like GPT-4o.

Description

Introduced the 405B-parameter flagship, the largest openly available model at the time, alongside updated 8B and 70B versions. Extended the context window to 128K tokens (roughly 100,000 words), enough to process entire books or large codebases in a single prompt. First open model to rival GPT-4o in overall capability.
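The "roughly 100,000 words" figure can be sanity-checked with a quick estimate. This is a minimal sketch: the helper `approx_words` and the 0.75 words-per-token ratio are assumptions (a common rule of thumb for English text, not a property of the Llama tokenizer), and "128K" is taken to mean 131,072 tokens.

```python
def approx_words(tokens: int, words_per_token: float = 0.75) -> int:
    """Rough word count for a token budget.

    words_per_token is an assumed rule-of-thumb ratio for English prose;
    the real ratio depends on the tokenizer and the text.
    """
    return round(tokens * words_per_token)


context_tokens = 128 * 1024  # assumption: "128K" means 131,072 tokens
print(approx_words(context_tokens))
```

Code and non-English text usually tokenize less efficiently, so the usable word count for those inputs is likely lower.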

Notable Milestones

▸ Largest open-weight model at time of release (405B)
▸ First open model to rival GPT-4o
▸ Adopted as distillation teacher for smaller open models

Benchmark Scores

MMLU: Massive Multitask Language Understanding, 57 subjects

88.6%

HumanEval: Code generation pass@1, Python problems

85.3%
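For readers unfamiliar with pass@1, the sketch below shows the unbiased pass@k estimator commonly used with HumanEval. The card does not say how the score above was produced, so the function names and the sample data here are purely illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which passed.

    Computes 1 - C(n - c, k) / C(n, k): the probability that at least one
    of k samples drawn without replacement from the n is correct.
    """
    if n - c < k:
        return 1.0  # too few failures to fill k draws, so one must pass
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k: int = 1) -> float:
    """Average pass@k over problems; results is a list of (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical run: 4 problems, one sample each, 3 of them solved.
toy_results = [(1, 1), (1, 0), (1, 1), (1, 1)]
print(benchmark_pass_at_k(toy_results, k=1))
```

With a single sample per problem, pass@1 reduces to the fraction of problems whose one generated solution passes the unit tests.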

MATH: MATH benchmark, competition-level problems

73.8%

Key Innovations

Open Weight

Model weights are publicly released but training data/code may not be. Enables fine-tuning but not full reproduction.

Long Context

Ability to process very long inputs (100K+ tokens), enabling analysis of entire codebases or books.

Related Research (2)

RoPE · Architecture

2021 · Zhuiyi Technology

Introduced rotary position embeddings that encode position via rotation matrices, enabling better length generalization. Used by virtually every moder…
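The rotation idea can be illustrated in a few lines of plain Python. This is a minimal sketch, not the implementation used in any particular model: the helper names, the adjacent-pair layout, and the `base=10000.0` default are assumptions chosen for illustration. It rotates each pair of vector components by an angle proportional to the token position, then checks that the query-key dot product depends only on the relative offset between positions.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs (x0, x1), (x2, x3), ... of vec.

    The pair starting at index i is rotated by position * base**(-i / d),
    so low-index pairs rotate quickly and high-index pairs slowly.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]  # 2-D rotation of (x, y)
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [1.0, 2.0, 3.0, 4.0]
k = [0.5, -1.0, 2.0, 0.0]

# Same relative offset (3) at two different absolute positions.
near = dot(rope(q, 5), rope(k, 2))
far = dot(rope(q, 103), rope(k, 100))
print(abs(near - far) < 1e-9)
```

Because attention scores are dot products of rotated queries and keys, they see only relative distance, which is the property the summary above credits for better length generalization.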

Grouped-Query Attention · Architecture

2023 · Google Research

Introduced grouped-query attention as a middle ground between multi-head and multi-query attention, reducing KV cache memory while maintaining quality…
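The KV cache saving is easy to quantify with back-of-the-envelope arithmetic. The sketch below is an assumption-laden illustration: the function name and the shape values (32 layers, head dimension 128, 32 query heads grouped into 8 key/value heads, 16-bit cache entries, one sequence of 131,072 tokens) are chosen as an example and are not taken from this page.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Size of the key/value cache for one sequence.

    Each layer stores two tensors (keys and values), each of shape
    [seq_len, n_kv_heads, head_dim].
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative configuration (assumed values, see the note above).
shape = dict(seq_len=128 * 1024, n_layers=32, head_dim=128)

mha = kv_cache_bytes(n_kv_heads=32, **shape)  # multi-head: one KV head per query head
gqa = kv_cache_bytes(n_kv_heads=8, **shape)   # grouped-query: 4 query heads share a KV head

print(mha / 2**30, gqa / 2**30)  # sizes in GiB
```

Under these assumptions the cache shrinks by the ratio of query heads to key/value heads (4x here), which matters most at long context lengths where the cache grows linearly with sequence length.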