
Wen Xiao

University of British Columbia

R-KV: Redundancy-aware KV Cache Compression for Training-Free Reasoning Models Acceleration

May 30, 2025

VisualToolAgent (VisTA): A Reinforcement Learning Framework for Visual Tool Selection

May 26, 2025

HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading

Feb 18, 2025

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Dec 30, 2024

Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning

Oct 28, 2024

Integrative Decoding: Improve Factuality via Implicit Self-consistency

Oct 02, 2024

Towards a Unified View of Preference Learning for Large Language Models: A Survey

Sep 04, 2024

LLM Critics Help Catch Bugs in Mathematics: Towards a Better Mathematical Verifier with Natural Language Feedback

Jun 30, 2024

PyramidKV: Dynamic KV Cache Compression based on Pyramidal Information Funneling

Jun 04, 2024

Cross-Task Defense: Instruction-Tuning LLMs for Content Safety

May 24, 2024