Peng Ye

Think Twice, Act Once: Token-Aware Compression and Action Reuse for Efficient Inference in Vision-Language-Action Models

May 27, 2025
Via arXiv

The Avengers: A Simple Recipe for Uniting Smaller Language Models to Challenge Proprietary Giants

May 26, 2025
Via arXiv

Doc-CoB: Enhancing Multi-Modal Document Understanding with Visual Chain-of-Boxes Reasoning

May 24, 2025
Via arXiv

NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification

May 22, 2025
Via arXiv

Decouple and Orthogonalize: A Data-Free Framework for LoRA Merging

May 21, 2025
Via arXiv

Dynamic Base model Shift for Delta Compression

May 16, 2025
Via arXiv

When Dynamic Data Selection Meets Data Augmentation

May 02, 2025
Via arXiv

Consistency-aware Self-Training for Iterative-based Stereo Matching

Mar 31, 2025
Via arXiv

FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding

Mar 19, 2025
Via arXiv

TokenCarve: Information-Preserving Visual Token Compression in Multimodal Large Language Models

Mar 13, 2025
Via arXiv