Gemini 1.0
Google DeepMind · December 2023
Why It Matters
Google's answer to GPT-4 and the first major model family designed to be multimodal from inception rather than having vision added as an afterthought. Gemini Ultra briefly claimed the top spot on key benchmarks.
Description
Google's first natively multimodal model family, built from the ground up to understand text, images, audio, and video together rather than having capabilities bolted on after the fact. It is available in three tiers: Ultra (most capable), Pro (balanced), and Nano (designed to run on mobile phones). It replaced PaLM as Google's flagship AI.
Notable Milestones
- Replaced PaLM as the engine behind Google's AI products
- Nano variant designed for on-device use on Pixel phones
- Ultra was the first model reported to exceed human-expert performance on the MMLU benchmark
Key Innovations
Related Research (2)
Combined chain-of-thought reasoning with external tool use (APIs, search), improving QA and decision-making through interleaved reasoning and action.
Introduced the Gemini family with native multimodal training from the ground up, reporting state-of-the-art results on 30 of the 32 benchmarks evaluated.