Qwen 2

Alibaba Cloud · June 2024

Active · Open Weight · Decoder-only · Text
Parameters: 0.5B - 72B
Context Window: 128K tokens
Variants: 0.5B, 1.5B, 7B, 72B

Why It Matters

Established the Qwen series as a genuine rival to Meta's LLaMA for the open-source LLM crown, with best-in-class multilingual support across 29 languages.

Description

Qwen 2 is a ground-up architecture redesign that dramatically improved performance across the board. It is available in sizes from 0.5B to 72B parameters, supports 29 languages, and offers a context window of up to 128K tokens (roughly 96,000 words, assuming about 0.75 words per token). It emerged as a leading open-source alternative to Meta's LLaMA on many benchmarks.

Notable Milestones

  • Most widely used Chinese-English bilingual open model
  • Base model for numerous community fine-tunes

Key Innovations

Open Weight
Model weights are publicly released, but training data and code may not be. Enables fine-tuning but not full reproduction.
Long Context
Ability to process very long inputs (100K+ tokens), enabling analysis of entire codebases or books.

Family Tree

Built On

Lineage

Qwen → Qwen 1.5 → Qwen 2

Successors (1)

Related Research (1)

RoPE · Architecture
2021 · Zhuiyi Technology

Introduced rotary position embeddings that encode position via rotation matrices, enabling better length generalization. Used by virtually every moder…
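
The rotation idea in the RoPE entry above can be illustrated with a minimal sketch. This is not Qwen 2's actual implementation: the function names `rope_rotate` and `dot` are hypothetical, the base of 10000 is the commonly used default and is assumed here, and the code pairs adjacent dimensions, whereas production libraries often pair the two halves of the vector instead. It shows the property that makes the scheme attractive: the query-key score depends only on the distance between positions, not on their absolute values.

```python
import math


def rope_rotate(vec, position, base=10000.0):
    """Rotate each adjacent pair of dimensions by a position-dependent angle.

    Illustrative helper (not a library API). `base` is an assumed default.
    """
    dim = len(vec)
    if dim % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for i in range(0, dim, 2):
        # Pair i/2 uses frequency base^(-i/dim); later pairs rotate more slowly.
        theta = base ** (-i / dim)
        angle = position * theta
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by `angle`.
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


# Toy query and key vectors (made-up values for illustration only).
q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.4]

# Same relative offset (3 positions apart) at two different absolute positions.
score_a = dot(rope_rotate(q, 5), rope_rotate(k, 2))
score_b = dot(rope_rotate(q, 103), rope_rotate(k, 100))

print(abs(score_a - score_b) < 1e-9)
```

Because each pair is only rotated, vector norms are unchanged, and shifting both positions by the same amount leaves the score the same up to floating-point error.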