Guilin Liu

Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

Mar 05, 2026
Via arXiv

Stateful Token Reduction for Long-Video Hybrid VLMs

Feb 27, 2026
Via arXiv

PhyCritic: Multimodal Critic Models for Physical AI

Feb 11, 2026
Via arXiv

OpenVision 3: A Family of Unified Visual Encoder for Both Understanding and Generation

Jan 21, 2026
Via arXiv

NVIDIA Nemotron Nano V2 VL

Nov 07, 2025
Via arXiv

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

May 29, 2025
Via arXiv

Nemotron-Research-Tool-N1: Tool-Using Language Models with Reinforced Reasoning

Apr 25, 2025
Via arXiv

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Apr 21, 2025
Via arXiv

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025
Via arXiv

Slow-Fast Architecture for Video Multi-Modal Large Language Models

Apr 02, 2025
Via arXiv