Jinlan Fu

HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

Jan 21, 2026 (via arXiv)

FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs

Jan 20, 2026 (via arXiv)

Analyzing Reasoning Consistency in Large Multimodal Models under Cross-Modal Conflicts

Jan 07, 2026 (via arXiv)

AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents

Dec 29, 2025 (via arXiv)

MCM-DPO: Multifaceted Cross-Modal Direct Preference Optimization for Alt-text Generation

Oct 01, 2025 (via arXiv)

LLM as Effective Streaming Processor: Bridging Streaming-Batch Mismatches with Group Position Encoding

May 22, 2025 (via arXiv)

Investigating and Enhancing the Robustness of Large Multimodal Models Against Temporal Inconsistency

May 20, 2025 (via arXiv)

Rethinking Visual Layer Selection in Multimodal LLMs

Apr 30, 2025 (via arXiv)

VistaDPO: Video Hierarchical Spatial-Temporal Direct Preference Optimization for Large Video Models

Apr 17, 2025 (via arXiv)

World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning

Mar 13, 2025 (via arXiv)