Jan Kautz

NVIDIA

ProfBench: Multi-Domain Rubrics requiring Professional Knowledge to Answer and Judge

Oct 21, 2025

3D Aware Region Prompted Vision Language Model

Sep 16, 2025

NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model

Aug 21, 2025

HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis

Aug 12, 2025

Scaling RL to Long Videos

Jul 10, 2025

ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

May 30, 2025

AdaHuman: Animatable Detailed 3D Human Generation with Compositional Multiview Diffusion

May 30, 2025

GeoMan: Temporally Consistent Human Geometry Estimation using Image-to-Video Diffusion

May 29, 2025

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

May 29, 2025

FLARE: Robot Learning with Implicit World Modeling

May 21, 2025