Shrimai Prabhumoye

NVIDIA Nemotron 3: Efficient and Open Intelligence

Dec 24, 2025
Via arXiv

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Dec 23, 2025
Via arXiv

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Aug 21, 2025
Via arXiv

Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset

Aug 20, 2025
Via arXiv

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

May 26, 2025
Via arXiv

Llama-Nemotron: Efficient Reasoning Models

May 02, 2025
Via arXiv

NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning

Apr 15, 2025
Via arXiv

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025
Via arXiv

Retro-Search: Exploring Untaken Paths for Deeper and Efficient Reasoning

Apr 06, 2025
Via arXiv

Maximize Your Data's Potential: Enhancing LLM Accuracy with Two-Phase Pretraining

Dec 18, 2024
Via arXiv