LLM Tree of Life

Nemotron-4 15B

NVIDIA · March 2024

Active · Open Weight · Decoder-only · Text

Parameters: 15B

Context Window: 8K tokens

Description

NVIDIA's multilingual language model, trained on 8 trillion tokens of text (roughly 6 trillion words) spanning multiple languages. Designed as a capable mid-size model for both research and enterprise deployment, it represents NVIDIA's push beyond hardware into building its own AI models.

Key Innovations

Autoregressive

Generates text one token at a time, with each prediction based on all previous tokens. This is the foundation of modern language models.
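The generation loop described here can be sketched with a toy model. This is an illustration only, not Nemotron-4's implementation: the n-gram counter stands in for a Transformer, it looks at no more than the last three tokens of the prefix rather than all of them, and the names `train_ngram`, `next_token`, and `generate` are hypothetical.

```python
from collections import Counter, defaultdict


def train_ngram(tokens, max_order=3):
    """Count which token follows each context of up to max_order tokens."""
    counts = defaultdict(Counter)
    for i in range(len(tokens)):
        for n in range(0, max_order + 1):
            if i - n < 0:
                break
            context = tuple(tokens[i - n:i])
            counts[context][tokens[i]] += 1
    return counts


def next_token(counts, prefix, max_order=3):
    """Greedy prediction, backing off from the longest matching suffix."""
    for n in range(min(max_order, len(prefix)), -1, -1):
        context = tuple(prefix[len(prefix) - n:])
        if context in counts:
            return counts[context].most_common(1)[0][0]
    raise ValueError("model has no counts")


def generate(counts, prompt, max_new_tokens, eos="<eos>"):
    """Autoregressive loop: each new token is appended to the context
    and then used when predicting the token after it."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(counts, out)
        out.append(tok)
        if tok == eos:
            break
    return out


corpus = "the cat sat on the mat <eos>".split()
model = train_ngram(corpus)
print(" ".join(generate(model, ["the", "cat"], max_new_tokens=10)))
```

A real model replaces the count table with a neural network that outputs a probability distribution over the vocabulary, and usually samples from it instead of always taking the most likely token.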

Instruction Tuning

Fine-tuning a model on instruction-response pairs so it follows user commands more reliably.
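As a rough sketch of what one instruction-response training example can look like: the prompt template, the whitespace tokenizer, and the `build_example` helper below are assumptions made for illustration, not NVIDIA's actual data format. Masking the loss so that only response tokens are trained on is a common choice, but not a universal one.

```python
def build_example(instruction, response, eos="<eos>"):
    """Turn one instruction-response pair into tokens plus a loss mask.

    Hypothetical template and whitespace tokenization, for illustration.
    The mask is 0 for prompt tokens and 1 for response tokens, so a
    trainer using it would only penalize errors on the response.
    """
    prompt_tokens = f"### Instruction: {instruction} ### Response:".split()
    response_tokens = response.split() + [eos]
    tokens = prompt_tokens + response_tokens
    loss_mask = [0] * len(prompt_tokens) + [1] * len(response_tokens)
    return tokens, loss_mask


tokens, loss_mask = build_example("Say hi", "Hello there")
for tok, m in zip(tokens, loss_mask):
    print(f"{m}  {tok}")
```

The fine-tuning objective is still next-token prediction, as in pretraining. What changes is the data, and optionally which positions contribute to the loss.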

Family Tree

Built On

Megatron-Turing NLG

Lineage

Megatron-Turing NLG → Nemotron-4 15B

Successors (1)

Nemotron-4 340B

Related Research (1)

Megatron-LM · Scaling

2019 · NVIDIA

Pioneered efficient model parallelism techniques enabling training of multi-billion parameter Transformers across GPUs.
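One form of model parallelism used by Megatron-LM splits a layer's weight matrix across GPUs so that each device computes part of the output. The sketch below simulates the column-wise split on a single process with nested lists. It is a simplified illustration with hypothetical helper names, and there is no real GPU communication in it.

```python
def matmul(x, w):
    """Plain matrix product of x (n x k) and w (k x m) as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)]
            for row in x]


def split_columns(w, parts):
    """Split w into `parts` equal column blocks, one per simulated device."""
    m = len(w[0])
    assert m % parts == 0, "column count must divide evenly"
    step = m // parts
    return [[row[p * step:(p + 1) * step] for row in w] for p in range(parts)]


def column_parallel_matmul(x, w, parts):
    """Each shard multiplies the full input by its own column block;
    concatenating the partial outputs stands in for an all-gather."""
    shards = split_columns(w, parts)
    partials = [matmul(x, shard) for shard in shards]
    return [sum((p[i] for p in partials), []) for i in range(len(x))]


x = [[1, 2], [3, 4]]
w = [[1, 0, 2, 1], [0, 1, 1, 3]]
print(column_parallel_matmul(x, w, parts=2))
print(matmul(x, w))  # same result without sharding
```

Each shard only has to hold its own slice of the weights, which is what lets a layer that is too large for one device's memory be spread over several.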

More from NVIDIA Nemotron

Megatron-Turing NLG · 2021-10 · 530B

Nemotron-4 340B · 2024-06 · 340B

Llama-3.1-Nemotron-70B · 2024-10 · 70B

NVLM 1.0 · 2024-10 · 72B

Nemotron 3 Nano · 2025-12 · 30B (3B active)

Nemotron 3 Super · 2026-03 · 120B (12B active)

Nemotron 3 Ultra · 2026-05 · 550B (55B active)

Cosmos 1.0 · 2025-01 · n/a
