Nemotron-4 15B

NVIDIA · March 2024

Active · Open Weight · Decoder-only · Text
Parameters: 15B
Context Window: 8K tokens

Description

NVIDIA's multilingual language model, trained on 8 trillion tokens of text (roughly 6 trillion words) spanning multiple languages. Designed as a capable mid-size model for both research and enterprise deployment, it represents NVIDIA's push beyond hardware into building its own AI models.

Key Innovations

Autoregressive
Generates text one token at a time, each prediction based on all previous tokens. The foundation of modern language models.
Instruction Tuning
Fine-tuning a model on instruction-response pairs so it follows user commands more reliably.
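
To make the autoregressive idea concrete, here is a minimal sketch of greedy token-by-token generation. It is not Nemotron-4's actual implementation: the bigram table, the function names (`next_token_scores`, `generate`), and the `<bos>`/`<eos>` markers are all hypothetical stand-ins chosen for illustration. A real decoder-only Transformer would condition on the whole context, whereas this toy scorer only inspects the last token.

```python
from typing import Dict, List

# Hypothetical toy "model": scores for the next token given the previous one.
# A real language model would compute these scores with a neural network.
BIGRAM_SCORES: Dict[str, Dict[str, float]] = {
    "<bos>": {"the": 0.9, "a": 0.1},
    "the": {"model": 0.7, "token": 0.3},
    "model": {"predicts": 0.8, "<eos>": 0.2},
    "predicts": {"tokens": 0.6, "<eos>": 0.4},
    "tokens": {"<eos>": 1.0},
}


def next_token_scores(context: List[str]) -> Dict[str, float]:
    """Return next-token scores for the given context.

    The full context is passed in, as an autoregressive model would receive it,
    but this toy version only looks at the most recent token.
    """
    return BIGRAM_SCORES.get(context[-1], {"<eos>": 1.0})


def generate(prompt: List[str], max_new_tokens: int = 10) -> List[str]:
    """Greedy autoregressive decoding: append one token per step."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)  # pick the highest-scoring token
        if best == "<eos>":  # stop when the model predicts end of sequence
            break
        tokens.append(best)  # the new token becomes part of the next context
    return tokens


if __name__ == "__main__":
    print(" ".join(generate(["<bos>"])))
```

The loop is the essential part: each step feeds everything generated so far back in as context. Production systems usually sample from the score distribution instead of always taking the maximum.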

Family Tree

Lineage

Megatron-Turing NLG → Nemotron-4 15B

Successors (1)

Related Research (1)

2019 · NVIDIA

Pioneered efficient model parallelism techniques enabling training of multi-billion parameter Transformers across GPUs.
