GPT-4o

OpenAI · May 2024

Deprecated · Closed · Mixture of experts · Multimodal · API Available
Context Window: 128K tokens
Sunset Date: February 2026

Why It Matters

Made frontier-level AI available for free by dramatically reducing costs. First model to natively understand and generate across text, vision, and audio in real time, enabling natural voice conversations with AI.

Description

OpenAI's 'omni' model — the first to natively process text, images, and audio in a single unified architecture rather than using separate models stitched together. Significantly faster and cheaper than GPT-4 Turbo while matching its intelligence, making frontier AI capabilities accessible to free-tier users.

Notable Milestones

  • Enabled real-time voice conversations in ChatGPT
  • Made GPT-4-level intelligence available to free users
  • Powered the ChatGPT desktop app launch

Benchmark Scores

MMLU (Massive Multitask Language Understanding, 57 subjects): 88.7%
HumanEval (code generation pass@1, Python problems): 90.2%
MATH (competition-level problems): 76.6%
GPQA (graduate-level science QA): 53.6%
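
The HumanEval figure is reported as pass@1. As a rough illustration of what that metric measures, the sketch below implements the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n is the number of samples generated for a problem and c is the number that pass the unit tests. The function names and the sample counts are hypothetical, and this page does not state how the 90.2% score was actually produced.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct."""
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # If fewer than k samples are incorrect, any draw of k includes a correct one.
    if n - c < k:
        return 1.0
    # Probability that k samples drawn without replacement are all incorrect,
    # subtracted from 1.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average the per-problem pass@k over (n, c) pairs."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical data: four problems, one sample each, three solved.
    toy_results = [(1, 1), (1, 0), (1, 1), (1, 1)]
    print(f"pass@1 = {benchmark_score(toy_results):.1%}")
```

With a single sample per problem, pass@1 reduces to the fraction of problems solved on the first attempt.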

Key Innovations

Multimodal: Processing multiple types of input (text, images, audio, video) in a single model.

Related Research (1)

GPT-4 · Scaling
2023 · OpenAI

Described GPT-4's multimodal capabilities and performance across professional/academic benchmarks, setting new SOTA on bar exam, MMLU, and many others…

External Links