Picture for Mostofa Patwary

Mostofa Patwary

Celine

Nemotron-TwoTower: Diffusion Language Modeling with Pretrained Autoregressive Context

Add code
Jun 25, 2026
Viaarxiv icon

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Add code
Jun 12, 2026
Viaarxiv icon

Cosmos 3: Omnimodal World Models for Physical AI

Add code
Jun 01, 2026
Viaarxiv icon

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Add code
Apr 27, 2026
Viaarxiv icon

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Add code
Apr 14, 2026
Viaarxiv icon

LatentMoE: Toward Optimal Accuracy per FLOP and Parameter in Mixture of Experts

Add code
Jan 26, 2026
Viaarxiv icon

NVIDIA Nemotron 3: Efficient and Open Intelligence

Add code
Dec 24, 2025
Viaarxiv icon

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Add code
Dec 23, 2025
Viaarxiv icon

NVIDIA Nemotron Nano V2 VL

Add code
Nov 07, 2025
Viaarxiv icon

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Add code
Aug 21, 2025
Figure 1 for NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Figure 2 for NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Figure 3 for NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Figure 4 for NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model
Viaarxiv icon