Rabeeh Karimi Mahabadi

NVIDIA Nemotron 3: Efficient and Open Intelligence

Dec 24, 2025

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Dec 23, 2025

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Aug 21, 2025

Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset

Aug 20, 2025

TESS: Text-to-Text Self-Conditioned Simplex Diffusion

May 15, 2023

PERFECT: Prompt-free and Efficient Few-shot Learning with Language Models

Apr 03, 2022

Variational Information Bottleneck for Effective Low-Resource Fine-Tuning

Jun 10, 2021

Compacter: Efficient Low-Rank Hypercomplex Adapter Layers

Jun 08, 2021

Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks

Jun 08, 2021

ParsiNLU: A Suite of Language Understanding Challenges for Persian

Dec 11, 2020