
Liang Lin

Large Vision-Language Models Get Lost in Attention

May 07, 2026

Personalized Cross-Modal Emotional Correlation Learning for Speech-Preserving Facial Expression Manipulation

Apr 28, 2026

EgoLive: A Large-Scale Egocentric Dataset from Real-World Human Tasks

Apr 26, 2026

JoyAI-RA 0.1: A Foundation Model for Robotic Autonomy

Apr 22, 2026

Learning Spatial-Temporal Coherent Correlations for Speech-Preserving Facial Expression Manipulation

Apr 22, 2026

ProjLens: Unveiling the Role of Projectors in Multimodal Model Safety

Apr 21, 2026

Robotic Manipulation is Vision-to-Geometry Mapping ($f(v) \rightarrow G$): Vision-Geometry Backbones over Language and Video Models

Apr 14, 2026

Visually-Guided Policy Optimization for Multimodal Reasoning

Apr 10, 2026

Uncovering Linguistic Fragility in Vision-Language-Action Models via Diversity-Aware Red Teaming

Apr 07, 2026

Referring-Aware Visuomotor Policy Learning for Closed-Loop Manipulation

Apr 07, 2026