GPT-3

OpenAI · June 2020

legacyCloseddecoder onlytextAPI Available
Parameters175B
Context Window2K tokens
VariantsAda, Babbage, Curie, Davinci
Sunset DateJanuary 2024

Why It Matters

The model that launched the modern AI era. Its 175 billion parameters showed that scaling up language models dramatically improves capabilities, enabling surprisingly human-like text generation and few-shot learning.

Description

With 175 billion parameters — over 100× larger than GPT-2 — this model demonstrated 'few-shot learning': the ability to perform new tasks from just a few examples in the prompt, without any retraining. This was the model that launched the modern AI era and proved that scale alone could unlock remarkable capabilities.

Notable Milestones

  • Powered the first wave of AI writing assistants
  • Spawned hundreds of startups built on the GPT-3 API
  • Demonstrated few-shot learning — performing tasks from a handful of examples

Key Innovations

Few-Shot
Few-ShotLearning from just a handful of examples provided in the prompt, without retraining.
Scaling Laws
Scaling LawsMathematical relationships showing how model performance improves predictably with more data, compute, and parameters.

Family Tree

Built On

Lineage

GPT-1GPT-2GPT-3

Related Research (3)

TransformerTransformer
2017 · Google Brain

Introduced the Transformer architecture using self-attention mechanisms, replacing RNNs entirely. Enabled parallel training and superior long-range de…

GPT-3Transformer
2020 · OpenAI

175B-parameter GPT. Pioneered few-shot and in-context learning, dramatically reducing the need for fine-tuning.

2020 · OpenAI

Found that model performance follows power laws in compute, parameters, and data. Provided the mathematical framework for scaling decisions.

Enabled By

A100NVIDIA · May 2020
312 TFLOPS FP16 Tensor

External Links