Zihao Zheng

Eric

Pyramid Forcing: Head-Aware Pyramid KV Cache Policy for High-Quality Long Video Generation

May 13, 2026
Via arXiv

FreqCache: Accelerating Embodied VLN Models with Adaptive Frequency-Guided Token Caching

Apr 27, 2026
Via arXiv

Ranking Abuse via Strategic Pairwise Data Perturbations

Apr 20, 2026
Via arXiv

2D or 3D: Who Governs Salience in VLA Models? -- Tri-Stage Token Pruning Framework with Modality Salience Awareness

Apr 10, 2026
Via arXiv

DIRECT: Video Mashup Creation via Hierarchical Multi-Agent Planning and Intent-Guided Editing

Apr 06, 2026
Via arXiv

A Self-Rotating Tri-Rotor UAV for Field of View Expansion and Autonomous Flight

Mar 30, 2026
Via arXiv

RoboECC: Multi-Factor-Aware Edge-Cloud Collaborative Deployment for VLA Models

Mar 21, 2026
Via arXiv

HeiSD: Hybrid Speculative Decoding for Embodied Vision-Language-Action Models with Kinematic Awareness

Mar 18, 2026
Via arXiv

CAST-TTS: A Simple Cross-Attention Framework for Unified Timbre Control in TTS

Mar 17, 2026
Via arXiv

RAPID: Redundancy-Aware and Compatibility-Optimal Edge-Cloud Partitioned Inference for Diverse VLA Models

Mar 12, 2026
Via arXiv