Jan Kautz

NVIDIA

Reinforcing Dual-Path Reasoning in Spatial Vision Language Models

Jun 16, 2026
Via arXiv

ProCUA-SFT Technical Report

Jun 15, 2026
Via arXiv

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Jun 12, 2026
Via arXiv

GRAIL: Generating Humanoid Loco-Manipulation from 3D Assets and Video Priors

Jun 03, 2026
Via arXiv

Cosmos 3: Omnimodal World Models for Physical AI

Jun 01, 2026
Via arXiv

Scaling Parallel Sequence Models to Foundation-Scale Vision Encoders

May 30, 2026
Via arXiv

Grounded 3D-Aware Spatial Vision-Language Modeling

May 28, 2026
Via arXiv

LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding

May 27, 2026
Via arXiv

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention

May 21, 2026
Via arXiv

Nemotron 3 Nano Omni: Efficient and Open Multimodal Intelligence

Apr 27, 2026
Via arXiv