Guo Chen

Bridging Perspectives: A Survey on Cross-view Collaborative Intelligence with Egocentric-Exocentric Vision

Jun 06, 2025
Via arXiv

AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs

Jun 05, 2025
Via arXiv

Time-Frequency-Based Attention Cache Memory Model for Real-Time Speech Separation

May 19, 2025
Via arXiv

Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models

Apr 21, 2025
Via arXiv

EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos

Apr 16, 2025
Via arXiv

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

Apr 10, 2025
Via arXiv

EagleVision: Object-level Attribute Multimodal LLM for Remote Sensing

Mar 30, 2025
Via arXiv

Token-Efficient Long Video Understanding for Multimodal LLMs

Mar 06, 2025
Via arXiv

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant

Mar 06, 2025
Via arXiv

Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning

Mar 02, 2025
Via arXiv