LLaMA 3

Meta · April 2024

● active · Open Weight · decoder only · text · API Available

Parameters: 8B / 70B

Context Window: 8K tokens

Variants: 8B, 70B

Why It Matters

Narrowed the quality gap between open and closed models, showing that openly available models could rival the best proprietary systems on many benchmarks.

Description

A major leap in open-model quality, available in 8B and 70B sizes. It was trained on 15 trillion tokens of text data, roughly 7 times more than LLaMA 2, which markedly improved its ability to reason, write code, and follow instructions. It approached GPT-4-level performance on many tasks.
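The scale of that training run is easier to judge with a little arithmetic. The sketch below is illustrative only: the 15T token count and the 8B / 70B sizes come from this page, while the 2T token figure for LLaMA 2 is an assumption taken from its public release and is not stated here.

```python
# 15T tokens and the 8B / 70B sizes are from this page.
# ASSUMPTION: LLaMA 2 was trained on about 2T tokens (not stated on this page).
LLAMA3_TOKENS = 15e12
LLAMA2_TOKENS = 2e12


def tokens_per_parameter(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


# How much more data LLaMA 3 saw than LLaMA 2 under the assumption above.
data_ratio = LLAMA3_TOKENS / LLAMA2_TOKENS

per_param = {
    "8B": tokens_per_parameter(LLAMA3_TOKENS, 8e9),
    "70B": tokens_per_parameter(LLAMA3_TOKENS, 70e9),
}

print(f"data ratio vs LLaMA 2: {data_ratio:.1f}x")
for name, value in per_param.items():
    print(f"{name}: ~{value:.0f} tokens per parameter")
```

Under these assumptions the smaller model sees far more tokens per parameter than the larger one, which is one way to read the "more data on a smaller model" strategy.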

Notable Milestones

▸Approached GPT-4-class performance as an open model
▸Trained on 15T tokens — 7x more data than LLaMA 2
▸Widely deployed via Hugging Face and cloud providers

Key Innovations

Open Weight: Model weights are publicly released, but the training data and code may not be. This enables fine-tuning but not full reproduction.

Family Tree

Built On

LLaMA 2

Lineage

LLaMA→LLaMA 2→LLaMA 3

Successors (2)

LLaMA 3.1 · LLaMA 3 8B Uncensored

Related Research (4)

LLaMA · Scaling

2023 · Meta AI

Showed that smaller models trained on significantly more data (following Chinchilla scaling laws) could match or exceed the performance of much larger…

RoPE · Architecture

2021 · Zhuiyi Technology

Introduced rotary position embeddings that encode position via rotation matrices, enabling better length generalization. Used by virtually every moder…
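A minimal sketch of the rotation idea, in plain Python so it stays dependency-free. It assumes the adjacent-pair convention and a frequency base of 10000; the function names and toy vectors are hypothetical, and production implementations may lay out the pairs differently. The check at the end illustrates the property that makes the scheme attractive: the query-key score depends only on the relative offset between positions.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles."""
    assert len(vec) % 2 == 0, "vector length must be even"
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        # Each pair has its own frequency; earlier pairs rotate faster.
        theta = position * base ** (-i / dim)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by angle theta.
        out += [x * cos_t - y * sin_t, x * sin_t + y * cos_t]
    return out


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors (hypothetical values).
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Same relative offset (3) at two different absolute positions.
score_near = dot(rope(q, 5), rope(k, 2))
score_far = dot(rope(q, 105), rope(k, 102))
print(math.isclose(score_near, score_far, abs_tol=1e-9))
```

Because each pair is only rotated, the vector's length is unchanged; position information lives entirely in the angles.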

Grouped-Query Attention · Architecture

2023 · Google Research

Introduced grouped-query attention as a middle ground between multi-head and multi-query attention, reducing KV cache memory while maintaining quality…
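The memory saving can be estimated with simple arithmetic. The sketch below assumes a hypothetical 8B-class configuration (32 layers, 32 query heads, head dimension 128, 8 key/value heads for the grouped variant) and 16-bit cache values; none of these figures are stated on this page. Only the 8K sequence length is taken from the context window listed above.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # The factor 2 covers the separate key and value tensors.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# ASSUMPTION: hypothetical 8B-class shape, chosen for illustration only.
LAYERS, QUERY_HEADS, HEAD_DIM = 32, 32, 128
SEQ_LEN = 8192  # 8K context window, as listed on this page

# Multi-head attention: one key/value head per query head.
mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)
# Grouped-query attention: groups of query heads share 8 key/value heads.
gqa = kv_cache_bytes(LAYERS, 8, HEAD_DIM, SEQ_LEN)
# Multi-query attention: all query heads share a single key/value head.
mqa = kv_cache_bytes(LAYERS, 1, HEAD_DIM, SEQ_LEN)

for name, size in [("multi-head", mha), ("grouped-query", gqa), ("multi-query", mqa)]:
    print(f"{name}: {size / 2**30:.3f} GiB per sequence")
```

The cache scales linearly with the number of key/value heads, so sharing them across groups of query heads cuts memory by the group size while keeping more than one head's worth of capacity.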

SwiGLU · Architecture

2020 · Google

Showed that SwiGLU activation (Swish + Gated Linear Unit) significantly improves Transformer FFN quality with minimal compute overhead.
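As a rough sketch of what "Swish + Gated Linear Unit" means in a feed-forward block: one projection is passed through Swish and used as a gate on a second projection, and the product is projected back down. The weight names (`w_gate`, `w_up`, `w_down`) and the toy values are hypothetical, biases are omitted, and plain Python lists stand in for tensors.

```python
import math


def swish(x):
    """Swish (SiLU) activation: x * sigmoid(x)."""
    # Fine for toy inputs; very large negative x would overflow math.exp.
    return x / (1.0 + math.exp(-x))


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """Feed-forward block with a SwiGLU gate (no biases)."""
    gate = [swish(g) for g in matvec(w_gate, x)]  # activated gate branch
    up = matvec(w_up, x)                          # linear branch
    hidden = [g * u for g, u in zip(gate, up)]    # elementwise gating
    return matvec(w_down, hidden)                 # project back to model size


# Toy shapes (hypothetical): model dimension 2, hidden dimension 3.
x = [1.0, -2.0]
w_gate = [[0.5, 0.1], [-0.3, 0.8], [0.2, -0.4]]
w_up = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w_down = [[1.0, -1.0, 0.5], [0.2, 0.3, -0.1]]

y = swiglu_ffn(x, w_gate, w_up, w_down)
print(y)
```

Compared with a plain two-matrix feed-forward block, the gate adds a third weight matrix, so the extra cost is one more projection of the input.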

External Links

Announcement

More from Meta LLaMA

LLaMA · 2023-02 · 7B - 65B

LLaMA 2 · 2023-07 · 7B - 70B

LLaMA 3.1 · 2024-07 · 8B / 70B / 405B

LLaMA 3.2 · 2024-09 · 1B / 3B / 11B / 90B

LLaMA 3.3 · 2024-12 · 70B

LLaMA 4 · 2025-04 · 17B active (Scout) / larger (Maverick)

MusicGen · 2023-06 · 3.3B

CodeLlama · 2023-08 · 7B - 70B
