Mistral Large 2

Mistral AI · July 2024

Active · Open Weight · Decoder-only · Text · API Available
Parameters: 123B
Context Window: 128K tokens

Why It Matters

Established Mistral AI as a credible competitor to frontier labs by delivering near-GPT-4-level performance in an open-weight package, particularly excelling in multilingual and code tasks.

Description

Mistral AI's 123-billion-parameter flagship model, which approaches the capabilities of the best closed models while remaining open-weight. It supports a 128K-token context window (roughly 100,000 words) and excels at coding, reasoning, and multilingual tasks across dozens of languages.
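The "roughly 100,000 words" figure can be sanity-checked with simple arithmetic. The sketch below assumes a rule-of-thumb ratio of about 0.75 English words per token and treats 128K as 128,000 tokens. Both are assumptions, not figures from Mistral AI, and the real ratio varies with the tokenizer and the language of the text.

```python
# Assumed rule of thumb for English text; not a property of any specific tokenizer.
WORDS_PER_TOKEN = 0.75


def estimate_words(n_tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Rough number of words that fit in a context of n_tokens tokens."""
    return round(n_tokens * words_per_token)


if __name__ == "__main__":
    context_tokens = 128_000  # assumption: "128K" read as 128,000 tokens
    print(f"{context_tokens} tokens is about {estimate_words(context_tokens)} words")
```

Under these assumptions the estimate lands slightly below 100,000 words, which is consistent with the rounded figure above.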

Notable Milestones

  • Near-GPT-4 performance as an open-weight model
  • Strong multilingual support across dozens of languages
  • Powers Mistral's Le Chat conversational assistant

Benchmark Scores

MMLU (Massive Multitask Language Understanding, 57 subjects): 84.0%
HumanEval (code generation pass@1, Python problems): 92.0%
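For readers unfamiliar with the pass@1 metric, the following is a minimal sketch of the pass@k estimator commonly used with HumanEval. The page does not say how the score above was computed, so this illustrates the metric in general rather than the exact evaluation behind the 92.0% figure. The function name and arguments are chosen here for illustration.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass the unit tests
    k: number of attempts the metric allows
    """
    if n - c < k:
        # Every size-k subset of the samples contains at least one passing sample.
        return 1.0
    # 1 minus the probability that k samples drawn without replacement all fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # With k = 1 the estimator reduces to the fraction of passing samples, c / n.
    print(pass_at_k(n=10, c=3, k=1))
```

A benchmark-level score is then the mean of this per-problem value over all problems in the set.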

Key Innovations

Open Weight: Model weights are publicly released, but the training data and code may not be. This enables fine-tuning but not full reproduction.

Family Tree

Lineage

Mistral 7B → Mixtral 8x7B → Mistral Large 2

Related Research (2)

2023 · Google Research

Introduced grouped-query attention as a middle ground between multi-head and multi-query attention, reducing KV cache memory while maintaining quality…
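To make the KV cache saving concrete, here is a small sketch that compares cache size for multi-head, grouped-query, and multi-query attention. The layer count, head counts, head dimension, and sequence length are hypothetical values chosen for illustration. They are not the configuration of Mistral Large 2 or of any model named on this page.

```python
def kv_cache_bytes(n_layers: int, n_kv_heads: int, head_dim: int, seq_len: int,
                   bytes_per_value: int = 2, batch_size: int = 1) -> int:
    """Bytes needed to cache keys and values for seq_len tokens."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value * batch_size


# Hypothetical configuration, for illustration only.
N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 128, 8192

mha = kv_cache_bytes(N_LAYERS, N_QUERY_HEADS, HEAD_DIM, SEQ_LEN)  # one KV head per query head
gqa = kv_cache_bytes(N_LAYERS, 8, HEAD_DIM, SEQ_LEN)              # 8 KV heads, each shared by 4 query heads
mqa = kv_cache_bytes(N_LAYERS, 1, HEAD_DIM, SEQ_LEN)              # a single KV head shared by all query heads

if __name__ == "__main__":
    for name, size in [("multi-head", mha), ("grouped-query", gqa), ("multi-query", mqa)]:
        print(f"{name}: {size / 2**30:.2f} GiB")
```

The cache scales linearly with the number of KV heads, so sharing each KV head across a group of query heads shrinks it by the group size while the number of query heads stays the same.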

Mistral 7B · Scaling
2023 · Mistral AI

Introduced sliding window attention and demonstrated that a 7B model could outperform Llama 2 13B on all evaluated benchmarks.
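As a rough illustration of sliding window attention, the sketch below builds a causal attention mask in which each position can attend only to itself and a fixed number of preceding positions. It shows the general idea in plain Python, not Mistral AI's implementation, and this page does not state whether Mistral Large 2 itself uses the technique. The function name and window size are hypothetical.

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """mask[i][j] is True when query position i may attend to key position j.

    Position i sees positions i - window + 1 through i: causal, with a bounded look-back.
    """
    return [[0 <= i - j < window for j in range(seq_len)] for i in range(seq_len)]


if __name__ == "__main__":
    # Print the mask as a grid: "x" marks an allowed query/key pair.
    for row in sliding_window_mask(seq_len=6, window=3):
        print("".join("x" if allowed else "." for allowed in row))
```

Because each row has at most `window` allowed positions, the per-token attention cost stays bounded as the sequence grows, and stacking layers lets information propagate beyond a single window.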

External Links