Zhihui Li

Progressive Online Video Understanding with Evidence-Aligned Timing and Transparent Decisions

Apr 20, 2026

LatentPilot: Scene-Aware Vision-and-Language Navigation by Dreaming Ahead with Latent Visual Reasoning

Mar 31, 2026

RiskProp: Collision-Anchored Self-Supervised Risk Propagation for Early Accident Anticipation

Mar 28, 2026

Beyond Dense Futures: World Models as Structured Planners for Robotic Manipulation

Mar 13, 2026

See, Plan, Rewind: Progress-Aware Vision-Language-Action Models for Robust Robotic Manipulation

Mar 10, 2026

CoNav: Collaborative Cross-Modal Reasoning for Embodied Navigation

May 22, 2025

FANeRV: Frequency Separation and Augmentation based Neural Representation for Video

Apr 09, 2025

Learning A Zero-shot Occupancy Network from Vision Foundation Models via Self-supervised Adaptation

Mar 10, 2025

High-Frequency Enhanced Hybrid Neural Representation for Video Compression

Nov 11, 2024

Unsupervised Visible-Infrared Person ReID by Collaborative Learning with Neighbor-Guided Label Refinement

May 22, 2023