Jan Kautz

NVIDIA

NVILA: Efficient Frontier Visual Language Models

Dec 05, 2024

NaVILA: Legged Robot Vision-Language-Action Model for Navigation

Dec 05, 2024

Hymba: A Hybrid-head Architecture for Small Language Models

Nov 20, 2024

HOVER: Versatile Neural Whole-Body Controller for Humanoid Robots

Oct 28, 2024

EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation

Oct 28, 2024

nvTorchCam: An Open-source Library for Camera-Agnostic Differentiable Geometric Vision

Oct 15, 2024

Exploring the design space of deep-learning-based weather forecasting systems

Oct 09, 2024

MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models

Sep 26, 2024

COIN: Control-Inpainting Diffusion Prior for Human and Camera Motion Estimation

Aug 29, 2024

Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders

Aug 28, 2024