Gemma

Google DeepMind · February 2024

activeOpen Sourcedecoder onlytext
Parameters2B / 7B
Context Window8K tokens
Variants2B, 7B

Why It Matters

Google's first serious open-weight competitor to Meta's Llama. Demonstrated that a major lab could release small, efficient models that outperformed many larger open alternatives.

Description

Google's first open-weight model family, built using the same research and technology behind the proprietary Gemini models but released for anyone to download, modify, and deploy. Available in 2B and 7B parameter sizes — small enough to run on a laptop or single GPU. Designed to make frontier AI research accessible to the broader developer community.

Notable Milestones

  • Widely adopted on Hugging Face for fine-tuning and research
  • Ran efficiently on consumer hardware including laptops

Key Innovations

Open Weight
Open WeightModel weights are publicly released but training data/code may not be. Enables fine-tuning but not full reproduction.
Distillation
DistillationTraining a smaller 'student' model to mimic a larger 'teacher' model, preserving capability at lower cost.

Family Tree

Successors (2)

Related Research (1)

2023 · Google Research

Introduced grouped-query attention as a middle ground between multi-head and multi-query attention, reducing KV cache memory while maintaining quality…