Roger Waleffe

LatentMoE: Toward Optimal Accuracy per FLOP and Parameter in Mixture of Experts

Jan 26, 2026, via arXiv

NVIDIA Nemotron 3: Efficient and Open Intelligence

Dec 24, 2025, via arXiv

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Dec 23, 2025, via arXiv

NVIDIA Nemotron Nano V2 VL

Nov 07, 2025, via arXiv

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Aug 21, 2025, via arXiv

Llama-Nemotron: Efficient Reasoning Models

May 02, 2025, via arXiv

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025, via arXiv

MLKV: Efficiently Scaling up Large Embedding Model Training with Disk-based Key-Value Storage

Apr 02, 2025, via arXiv

Armada: Memory-Efficient Distributed Training of Large-Scale Graph Neural Networks

Feb 25, 2025, via arXiv

GraphSnapShot: Graph Machine Learning Acceleration with Fast Storage and Retrieval

Jun 25, 2024, via arXiv