Mistral Large 2
Mistral AI · July 2024
Why It Matters
Established Mistral AI as a credible competitor to frontier labs by delivering near-GPT-4-level performance in an open-weight package, particularly excelling in multilingual and code tasks.
Description
Mistral AI's 123 billion parameter flagship model, approaching the capabilities of the best closed models while remaining open-weight. Supports a 128K token context window (roughly 100,000 words) and excels at coding, reasoning, and multilingual tasks across dozens of languages.
Notable Milestones
- ▸Near-GPT-4 performance as an open-weight model
- ▸Strong multilingual support across dozens of languages
- ▸Powers Mistral's Le Chat conversational assistant
Benchmark Scores
Key Innovations
Family Tree
Built On
Lineage
Successors (2)
Related Research (2)
Introduced grouped-query attention as a middle ground between multi-head and multi-query attention, reducing KV cache memory while maintaining quality…
Introduced sliding window attention and demonstrated that a 7B model could outperform LLaMA 2 13B on all benchmarks.