Falcon 180B
TII (UAE) · September 2023
legacy · Open Weight · decoder-only · text
Parameters: 180B
Context Window: 2K tokens
Why It Matters
The largest open-weight model at the time of its release, Falcon 180B showed that openly available models could approach GPT-4-level performance on several benchmarks.
Description
At 180 billion parameters, Falcon 180B was the largest open-weight model available when it was released. Trained on 3.5 trillion tokens drawn mostly from TII's RefinedWeb dataset, it approached GPT-4-level performance on several benchmarks while remaining freely available. Running it required substantial computing resources, but it demonstrated what open models could achieve at scale.
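To give a rough sense of the computing resources involved, the sketch below estimates the memory needed just to hold 180 billion weights at a few storage precisions. The precisions and their byte sizes are assumptions chosen for illustration, not figures from this card, and the estimate ignores activations, the KV cache, and framework overhead, so real requirements are higher.

```python
# Rough weight-only memory estimate for a 180B-parameter model.
PARAMS = 180e9  # parameter count stated on this card

# Bytes per parameter for common storage precisions
# (illustrative assumptions, not taken from the card).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Return weight storage in gigabytes, using 1 GB = 1e9 bytes."""
    return params * bytes_per_param / 1e9


estimates = {
    name: weight_memory_gb(PARAMS, size) for name, size in BYTES_PER_PARAM.items()
}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.0f} GB for weights alone")
```

Under these assumptions, even a 4-bit copy of the weights is larger than the memory of a single typical accelerator, which is why a model of this size is normally split across several devices.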
Notable Milestones
- Largest open-weight model at release
- Near-GPT-4 performance on several benchmarks
Key Innovations
Open Weight: Model weights are publicly released, but the training data and code may not be. This enables fine-tuning but not full reproduction.
Scaling Laws: Mathematical relationships showing how model performance improves predictably with more data, compute, and parameters.
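As a small worked example of how scaling-law reasoning relates data to model size, the sketch below computes the training-tokens-per-parameter ratio from the two figures on this card. The comparison value of about 20 tokens per parameter is a commonly cited rule of thumb from the Chinchilla scaling-law study; it is an outside reference point assumed here for illustration, not something this card states.

```python
# Tokens-per-parameter ratio from the figures on this card.
PARAMS = 180e9   # 180 billion parameters
TOKENS = 3.5e12  # 3.5 trillion training tokens

# Commonly cited compute-optimal heuristic (assumption for comparison only).
RULE_OF_THUMB_TOKENS_PER_PARAM = 20.0

tokens_per_param = TOKENS / PARAMS
relative_to_heuristic = tokens_per_param / RULE_OF_THUMB_TOKENS_PER_PARAM

print(f"Tokens per parameter: {tokens_per_param:.1f}")
print(f"Fraction of the ~20 tokens/param heuristic: {relative_to_heuristic:.2f}")
```

By this rough measure the training budget sits close to the heuristic, though the ratio alone says nothing about data quality or final benchmark performance.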