Picture for Zichen Wen

Zichen Wen

D2Pruner: Debiased Importance and Structural Diversity for MLLM Token Pruning

Add code
Dec 22, 2025
Viaarxiv icon

IPCV: Information-Preserving Compression for MLLM Visual Encoders

Add code
Dec 21, 2025
Viaarxiv icon

DOCR-Inspector: Fine-Grained and Automated Evaluation of Document Parsing with VLM

Add code
Dec 11, 2025
Viaarxiv icon

OmniLayout: Enabling Coarse-to-Fine Learning with LLMs for Universal Document Layout Generation

Add code
Oct 30, 2025
Viaarxiv icon

AI for Service: Proactive Assistance with AI Glasses

Add code
Oct 16, 2025
Viaarxiv icon

ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution

Add code
Oct 14, 2025
Figure 1 for ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution
Figure 2 for ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution
Figure 3 for ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution
Figure 4 for ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution
Viaarxiv icon

AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs

Add code
Oct 08, 2025
Figure 1 for AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs
Figure 2 for AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs
Figure 3 for AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs
Figure 4 for AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs
Viaarxiv icon

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods

Add code
Oct 08, 2025
Viaarxiv icon

DTPA: Dynamic Token-level Prefix Augmentation for Controllable Text Generation

Add code
Aug 06, 2025
Figure 1 for DTPA: Dynamic Token-level Prefix Augmentation for Controllable Text Generation
Figure 2 for DTPA: Dynamic Token-level Prefix Augmentation for Controllable Text Generation
Figure 3 for DTPA: Dynamic Token-level Prefix Augmentation for Controllable Text Generation
Figure 4 for DTPA: Dynamic Token-level Prefix Augmentation for Controllable Text Generation
Viaarxiv icon

EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models

Add code
Jun 11, 2025
Figure 1 for EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Figure 2 for EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Figure 3 for EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Figure 4 for EfficientVLA: Training-Free Acceleration and Compression for Vision-Language-Action Models
Viaarxiv icon