Mohammad Shoeybi

NVIDIA

Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation

Mar 19, 2026
Via arXiv

MMOU: A Massive Multi-Task Omni Understanding and Reasoning Benchmark for Long and Complex Real-World Videos

Mar 14, 2026
Via arXiv

Scalable Training of Mixture-of-Experts Models with Megatron Core

Mar 10, 2026
Via arXiv

On Data Engineering for Scaling LLM Terminal Capabilities

Feb 24, 2026
Via arXiv

LatentMoE: Toward Optimal Accuracy per FLOP and Parameter in Mixture of Experts

Jan 26, 2026
Via arXiv

NVIDIA Nemotron 3: Efficient and Open Intelligence

Dec 24, 2025
Via arXiv

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Dec 23, 2025
Via arXiv

Nemotron-Cascade: Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models

Dec 15, 2025
Via arXiv

Music Flamingo: Scaling Music Understanding in Audio Language Models

Nov 13, 2025
Via arXiv

NVIDIA Nemotron Nano V2 VL

Nov 07, 2025
Via arXiv