Zhiding Yu

FRAG: Frame Selection Augmented Generation for Long Video and Long Document Understanding

Apr 24, 2025
Via arXiv

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Apr 21, 2025
Via arXiv

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025
Via arXiv

OmniDrive: A Holistic Vision-Language Dataset for Autonomous Driving with Counterfactual Reasoning

Apr 06, 2025
Via arXiv

Slow-Fast Architecture for Video Multi-Modal Large Language Models

Apr 02, 2025
Via arXiv

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots

Mar 18, 2025
Via arXiv

Hydra-MDP++: Advancing End-to-End Driving via Expert-Guided Hydra-Distillation

Mar 17, 2025
Via arXiv

Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training

Mar 15, 2025
Via arXiv

Centaur: Robust End-to-End Autonomous Driving with Test-Time Training

Mar 14, 2025
Via arXiv

Token-Efficient Long Video Understanding for Multimodal LLMs

Mar 06, 2025
Via arXiv