Falcon 3

TII (UAE) · December 2024

Active · Open Source · Decoder-only · Text
Parameters: 1B - 10B
Context Window: 8K tokens
Variants: 1B, 3B, 7B, 10B, Edge-1B (1-bit)

Why It Matters

Pivoted from massive scale to extreme efficiency, with experimental 1-bit models pushing the boundaries of how small and fast open-source AI can be.

Description

Falcon 3 marks a strategic pivot from massive models to efficient small ones. Instead of scaling up, TII trained compact models (1B to 10B parameters) on 14 trillion tokens of data. The family also includes experimental 'Falcon-Edge' models using 1-bit quantization, an extreme compression technique in which each model weight is stored as a single bit, so the models can run on highly resource-constrained devices.
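To make the "one bit per weight" idea concrete, here is a minimal sketch of sign-based 1-bit quantization in plain Python. It is a generic illustration, not TII's method: the function names, the bit-packing layout, and the single shared scale (the mean absolute weight) are all assumptions chosen for clarity, and the scheme actually used by the Falcon-Edge models is not described here.

```python
from typing import List, Tuple


def quantize_1bit(weights: List[float]) -> Tuple[bytes, float, int]:
    """Keep only the sign of each weight (1 bit) plus one shared float scale.

    Hypothetical helper for illustration; not taken from any Falcon release.
    """
    if not weights:
        return b"", 0.0, 0
    # Assumed scale: mean absolute value, so magnitudes are preserved on average.
    scale = sum(abs(w) for w in weights) / len(weights)
    packed = bytearray((len(weights) + 7) // 8)  # 8 weights per byte
    for i, w in enumerate(weights):
        if w >= 0:  # bit = 1 means +scale, bit = 0 means -scale
            packed[i // 8] |= 1 << (i % 8)
    return bytes(packed), scale, len(weights)


def dequantize_1bit(packed: bytes, scale: float, n: int) -> List[float]:
    """Expand each stored bit back to +scale or -scale."""
    return [
        scale if (packed[i // 8] >> (i % 8)) & 1 else -scale
        for i in range(n)
    ]


packed, scale, n = quantize_1bit([0.5, -1.5, 1.0, -1.0])
restored = dequantize_1bit(packed, scale, n)
```

Under this scheme, 1 billion weights occupy about 125 MB (10^9 bits / 8), compared with about 2 GB at 16 bits per weight, ignoring the scale factors and any layers kept at higher precision.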

Notable Milestones

  • 10B model led its category on the Open LLM Leaderboard
  • 1-bit Edge models for ultra-low-resource deployment

Key Innovations

Open Weight: Model weights are publicly released but training data/code may not be. Enables fine-tuning but not full reproduction.
Distillation: Training a smaller 'student' model to mimic a larger 'teacher' model, preserving capability at lower cost.
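The distillation idea can be sketched as a loss that pushes the student's output distribution toward the teacher's. The snippet below is a minimal, generic version (temperature-softened softmax plus KL divergence) in plain Python; the function names and the default temperature of 2.0 are assumptions for illustration, and it is not a description of how Falcon 3 was actually trained.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: List[float],
    teacher_logits: List[float],
    temperature: float = 2.0,  # assumed value; higher softens both distributions
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Illustrative only: assumes modest logit ranges so no probability
    underflows to zero.
    """
    teacher = softmax(teacher_logits, temperature)
    student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)
    return kl * temperature ** 2


# The loss is zero when the student matches the teacher and positive otherwise.
matched = distillation_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
mismatched = distillation_loss([0.0, 0.0, 0.0], [2.0, 0.5, -1.0])
```

In practice this term is usually combined with the ordinary next-token loss on the training data, so the student learns from both the teacher and the ground truth.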

Family Tree

Lineage

Falcon 40B → Falcon 180B → Falcon 3