Zhihao Mao

FreqCache: Accelerating Embodied VLN Models with Adaptive Frequency-Guided Token Caching

Apr 27, 2026
Via arXiv

2D or 3D: Who Governs Salience in VLA Models? -- Tri-Stage Token Pruning Framework with Modality Salience Awareness

Apr 10, 2026
Via arXiv

RAP: Retrieve, Adapt, and Prompt-Fit for Training-Free Few-Shot Medical Image Segmentation

Mar 29, 2026
Via arXiv

HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness

Mar 18, 2026
Via arXiv

KERV: Kinematic-Rectified Speculative Decoding for Embodied VLA Models

Mar 02, 2026
Via arXiv