Claude 3.5 Sonnet

Anthropic · June 2024

active · Closed · dense transformer · multimodal · API Available
Parameters: ~175B (unofficial estimate; Anthropic has not published a parameter count)
Context Window: 200K tokens

Why It Matters

Redefined expectations for mid-tier models by outperforming Anthropic's previous flagship, Claude 3 Opus, at a fraction of the cost. Became the most widely used Claude model and a developer favorite for coding tasks.

Description

A mid-tier model that punched well above its weight, matching or exceeding the more expensive Claude 3 Opus on most benchmarks while running faster and costing less. Quickly became the most popular model among developers for coding and complex tasks. An upgraded version released in October 2024 added 'computer use' in public beta: the ability to interact with a computer screen like a human user.

Notable Milestones

  • Became the default model in Cursor and other AI coding tools
  • First frontier model to offer 'computer use' in public beta (added in the October 2024 upgrade): controlling a desktop like a human
  • Top-ranked on coding benchmarks like SWE-bench

Benchmark Scores

  • MMLU (Massive Multitask Language Understanding, 57 subjects): 88.7%
  • HumanEval (code generation pass@1, Python problems): 92.0%
  • GPQA (graduate-level science QA): 59.4%
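The HumanEval entry above reports pass@1. As a rough illustration of what that metric measures, the sketch below implements the unbiased pass@k estimator described in the original HumanEval paper. The page does not say how the 92.0% figure was produced, so the sampling setup here (n samples per problem, c of them passing the unit tests) is an assumption, and the function names are illustrative.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which passed."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples failed, every size-k draw contains a pass.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples are failures)
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average the per-problem estimate over (n, c) pairs, one per problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return sum(scores) / len(scores)
```

With k = 1 the estimator reduces to c / n per problem, so a single-sample run scores each problem as either 0 or 1 and the benchmark score is simply the fraction of problems solved on the first attempt.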

Key Innovations

  • Code Gen: Ability to write, debug, and understand programming code across multiple languages.
  • Agentic: Models that can autonomously plan, execute multi-step tasks, use tools, and self-correct without human intervention.
  • Tool Use: Ability to call external tools, APIs, and functions, enabling web browsing, code execution, and real-world actions.
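To make the tool-use idea concrete, here is a minimal sketch of the application side of a tool call: the model emits a structured request naming a tool and its arguments, and the host program runs the matching function and returns the result. This is not Anthropic's API. The request shape (`name`, `input`), the registry, and both example tools are hypothetical and chosen only for illustration.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical registry mapping tool names to plain Python functions.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name so it can be called by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def add(a: float, b: float) -> float:
    return a + b


@tool
def word_count(text: str) -> int:
    return len(text.split())


def dispatch(request_json: str) -> str:
    """Run one tool request and return a JSON result for the model to read."""
    request = json.loads(request_json)
    name = request.get("name")
    args = request.get("input", {})
    if name not in TOOLS:
        return json.dumps({"ok": False, "error": f"unknown tool: {name}"})
    try:
        result = TOOLS[name](**args)
    except Exception as exc:  # report failures instead of crashing the loop
        return json.dumps({"ok": False, "error": str(exc)})
    return json.dumps({"ok": True, "result": result})
```

In an agentic setup, a loop would feed each `dispatch` result back to the model and let it decide on the next step. Returning errors as data rather than raising them gives the model a chance to correct its own call.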

Family Tree

Built On

Lineage

Claude 1.0 → Claude 2 → Claude 3 → Claude 3.5 Sonnet

Successors (1)

Related Research (1)

Constitutional AI: Harmlessness from AI Feedback

2022 · Anthropic

Introduced RL from AI Feedback using "constitutions" (rule sets) for self-supervision, reducing reliance on human labels for harmlessness training.
