Falcon 3
TII (UAE) · December 2024
● Active · Open Source · Decoder-only · Text
Parameters: 1B to 10B
Context Window: 8K tokens
Variants: 1B, 3B, 7B, 10B, Edge-1B (1-bit)
Why It Matters
Pivoted from massive scale to extreme efficiency, with experimental 1-bit models pushing the boundaries of how small and fast open-source AI can be.
Description
A complete strategic pivot from massive models to efficient small ones. Instead of scaling up, TII trained compact models (1B to 10B parameters) on 14 trillion tokens of data. The family also includes experimental 'Falcon-Edge' models that use 1-bit quantization, an extreme compression technique in which each model weight is stored as a single bit, so the models can run on highly resource-constrained devices.
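To make the "single bit per weight" idea concrete, here is a minimal sketch of sign-based 1-bit quantization in plain Python. It is an illustration only, not the Falcon-Edge method: the function names, the per-tensor scale (mean absolute weight), and the bit-packing layout are all assumptions chosen for clarity.

```python
def quantize_1bit(weights):
    """Keep only the sign of each weight (1 bit) plus one shared float scale.

    Assumption: the scale is the mean absolute weight, a common choice in
    binarization schemes; real systems may use a different rule.
    """
    scale = sum(abs(w) for w in weights) / len(weights)
    packed = bytearray((len(weights) + 7) // 8)  # 8 weights per byte
    for i, w in enumerate(weights):
        if w >= 0:
            packed[i // 8] |= 1 << (i % 8)  # bit set means "+scale"
    return bytes(packed), scale, len(weights)


def dequantize_1bit(packed, scale, n):
    """Rebuild approximate weights: +scale where the bit is set, else -scale."""
    return [scale if (packed[i // 8] >> (i % 8)) & 1 else -scale for i in range(n)]


def weight_memory_gb(num_params, bits_per_weight):
    """Storage for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


weights = [0.5, -1.5, 0.25, -0.75]  # hypothetical toy weights
packed, scale, n = quantize_1bit(weights)
approx = dequantize_1bit(packed, scale, n)

print(approx)                                  # [0.75, -0.75, 0.75, -0.75]
print(weight_memory_gb(1_000_000_000, 16))     # 2.0
print(weight_memory_gb(1_000_000_000, 1))      # 0.125
```

Under these assumptions, the weights of a 1B-parameter model shrink from about 2 GB at 16 bits per weight to about 0.125 GB at 1 bit per weight, at the cost of discarding every weight's magnitude except the shared scale.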
Notable Milestones
- 10B model led its category on the Open LLM Leaderboard
- 1-bit Edge models for ultra-low-resource deployment
Key Innovations
Open Weight
Model weights are publicly released but training data/code may not be. Enables fine-tuning but not full reproduction.
Distillation
Training a smaller 'student' model to mimic a larger 'teacher' model, preserving capability at lower cost.
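One common way to make a student "mimic" a teacher is to penalize the difference between their output distributions. The sketch below shows a generic temperature-scaled KL-divergence loss in plain Python. It is not a description of TII's training recipe: the function names, the temperature value, and the example logits are assumptions for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature**2 factor is the conventional rescaling that keeps the
    loss magnitude comparable across temperatures (an assumption here).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]        # hypothetical teacher logits for one token
good_student = [4.0, 1.0, 0.5]   # matches the teacher exactly
poor_student = [0.5, 1.0, 4.0]   # ranks the classes in the opposite order

print(distillation_loss(good_student, teacher))  # 0.0
print(distillation_loss(poor_student, teacher) > 0)  # True
```

In practice this term is usually combined with the ordinary next-token loss on the training data, so the student learns from both the teacher's soft predictions and the ground-truth text.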