Yang Sui

Henry

When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios

Jul 27, 2025
Via arXiv

Multi-task Learning for Heterogeneous Data via Integrating Shared and Task-Specific Encodings

May 30, 2025
Via arXiv

Multi-task Learning for Heterogeneous Multi-source Block-Wise Missing Data

May 30, 2025
Via arXiv

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models

May 28, 2025
Via arXiv

HoliTom: Holistic Token Merging for Fast Video Large Language Models

May 28, 2025
Via arXiv

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float

Apr 15, 2025
Via arXiv

Plug-and-Play 1.x-Bit KV Cache Quantization for Video Large Language Models

Mar 20, 2025
Via arXiv

Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models

Mar 20, 2025
Via arXiv

Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization

Feb 06, 2025
Via arXiv

Understanding Artificial Neural Network's Behavior from Neuron Activation Perspective

Dec 24, 2024
Via arXiv