Linfeng Zhang

Shanghai Jiao Tong University

Are We Using the Right Benchmark: An Evaluation Framework for Visual Token Compression Methods

Oct 08, 2025

AudioMarathon: A Comprehensive Benchmark for Long-Context Audio Understanding and Efficiency in Audio LLMs

Oct 08, 2025

LMM-Incentive: Large Multimodal Model-based Incentive Design for User-Generated Content in Web 3.0

Oct 06, 2025

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Sep 26, 2025

PANORAMA: The Rise of Omnidirectional Vision in the Embodied AI Era

Sep 16, 2025

SageLM: A Multi-aspect and Explainable Large Language Model for Speech Judgement

Aug 28, 2025

HiCache: Training-free Acceleration of Diffusion Models via Hermite Polynomial-based Feature Caching

Aug 23, 2025

Follow-Your-Shape: Shape-Aware Image Editing via Trajectory-Guided Region Control

Aug 12, 2025

Accelerating Diffusion Large Language Models with SlowFast: The Three Golden Principles

Jun 12, 2025

SkipVAR: Accelerating Visual Autoregressive Modeling via Adaptive Frequency-Aware Skipping

Jun 11, 2025