DeepSeek R1
DeepSeek · January 2025
Why It Matters
The first open-weight reasoning model to match OpenAI's o1. It demonstrated that chain-of-thought reasoning can be trained into openly released models, not just proprietary ones, which made advanced reasoning capabilities broadly accessible.
Description
An open-weight reasoning model rivaling OpenAI's o1. It uses chain-of-thought reasoning: the model 'thinks out loud' step by step before answering. This behavior is trained primarily through reinforcement learning (reward-based trial and error) rather than large sets of human-written reasoning examples. The companion model R1-Zero used reinforcement learning alone, while R1 added a small supervised cold-start stage before it. DeepSeek also released distilled versions (smaller models trained to mimic R1's reasoning), the smallest at 1.5B parameters.
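As a rough illustration of what "reward-based trial and error" can look like, the sketch below scores several sampled answers to one question with a simple rule and then compares each score to the group average. The reward values, the function names, and the exact format rule are assumptions made for this example, not DeepSeek's training code.

```python
import re
from statistics import mean, pstdev


def rule_based_reward(completion: str, reference_answer: str) -> float:
    """Toy reward (hypothetical values): 0.5 for following the format,
    plus 1.0 if the final answer matches the reference.
    An answer is only checked when the format is followed."""
    reward = 0.0
    # Expected shape: <think>reasoning</think> final answer
    match = re.fullmatch(
        r"\s*<think>(.+?)</think>\s*(.+?)\s*", completion, flags=re.DOTALL
    )
    if match:
        reward += 0.5
        if match.group(2).strip() == reference_answer.strip():
            reward += 1.0
    return reward


def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize rewards within one group of samples for the same prompt.
    Samples that beat the group average get a positive advantage."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]  # all samples scored the same
    return [(r - mu) / sigma for r in rewards]


# Three hypothetical samples for the prompt "What is 2 + 2?"
samples = [
    "<think>2 + 2 is 4</think> 4",  # right format, right answer
    "<think>just a guess</think> 5",  # right format, wrong answer
    "4",                              # no visible reasoning
]
rewards = [rule_based_reward(s, "4") for s in samples]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

In a real training loop, a positive advantage would increase the probability of the sampled response and a negative one would decrease it. No human-written reasoning trace is needed, only a way to check the outcome.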
Notable Milestones
- First open-weight model to match OpenAI o1 on reasoning benchmarks
- Distilled versions brought reasoning to models as small as 1.5B parameters
- Sparked a wave of open-source reasoning model development
Related Research (2)
Introduced Multi-head Latent Attention (MLA), which compresses the key-value cache into a low-rank latent space, dramatically reducing the memory need…
Demonstrated that pure RL training (without supervised fine-tuning on reasoning traces) can produce chain-of-thought reasoning, achieving performance …
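To make the low-rank cache compression mentioned in the first entry above more concrete, here is a back-of-the-envelope sketch comparing how many values are cached per token in each layer. The head count, head dimension, and latent size are hypothetical numbers chosen for illustration, and the sketch ignores any extra positional components a real implementation may also cache.

```python
def standard_cache_values_per_token(n_heads: int, head_dim: int) -> int:
    """Standard multi-head attention caches one key vector and one value
    vector per head for every token."""
    return 2 * n_heads * head_dim


def latent_cache_values_per_token(latent_dim: int) -> int:
    """A low-rank scheme caches a single shared latent vector per token;
    keys and values are re-projected from it when attention is computed."""
    return latent_dim


# Hypothetical sizes, not taken from any published model configuration.
n_heads, head_dim, latent_dim = 32, 128, 512

full = standard_cache_values_per_token(n_heads, head_dim)
compressed = latent_cache_values_per_token(latent_dim)
ratio = full / compressed
print(f"standard: {full} values, latent: {compressed} values, ratio: {ratio:.0f}x")
```

Under these assumed sizes the cached state per token shrinks by the ratio printed above, which is the kind of saving that lets longer contexts fit in the same memory.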