MAI-1
Microsoft · May 2024
● activeCloseddecoder onlytext
Parameters~500B
Context Window128K tokens
Why It Matters
Marked Microsoft's first attempt at building its own frontier model independent of OpenAI, signaling the company's ambition to control its own AI destiny.
Description
Microsoft's first large-scale proprietary language model, reportedly around 500 billion parameters. Built under Mustafa Suleyman's leadership at Microsoft AI, it was designed to compete directly with GPT-4 and Gemini rather than focusing on small-model efficiency like the Phi series.
Key Innovations
Transformer
TransformerNeural network architecture using self-attention to process entire sequences in parallel. Replaced RNNs and enabled massive scaling.
Scaling Laws
Scaling LawsMathematical relationships showing how model performance improves predictably with more data, compute, and parameters.