BERT
Google · October 2018
Why It Matters
Revolutionized NLP by introducing bidirectional context — understanding words based on BOTH their left and right context. Became the backbone of Google Search and spawned hundreds of derivatives.
Description
Google's Bidirectional Encoder Representations from Transformers — a model that reads text in both directions simultaneously (left-to-right AND right-to-left) to understand context, unlike GPT which only reads left-to-right. Trained by randomly hiding words and learning to predict them (called masked language modeling). Dominated NLP benchmarks for years and became the backbone of Google Search.
Notable Milestones
- ▸Powered Google Search's understanding of queries
- ▸Spawned RoBERTa, ALBERT, DistilBERT and hundreds of derivatives
- ▸Became the standard baseline for NLP research benchmarks
Key Innovations
Related Research (2)
Introduced the Transformer architecture using self-attention mechanisms, replacing RNNs entirely. Enabled parallel training and superior long-range de…
Encoder-only bidirectional pretraining with masked language modeling (MLM) and next-sentence prediction. Set SOTA on GLUE benchmarks.