Mao Yang

SeerAttention: Learning Intrinsic Sparse Attention in Your LLMs

Oct 17, 2024
Via arXiv

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Sep 25, 2024
Via arXiv

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Aug 12, 2024
Via arXiv

LUT Tensor Core: Lookup Table Enables Efficient Low-Bit LLM Inference Acceleration

Aug 12, 2024
Via arXiv

T-MAC: CPU Renaissance via Table Lookup for Low-Bit LLM Deployment on Edge

Jun 25, 2024
Via arXiv

MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels

May 13, 2024
Via arXiv

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Feb 21, 2024
Via arXiv

Boosting LLM Reasoning: Push the Limits of Few-shot Learning with Reinforced In-Context Pruning

Dec 26, 2023
Via arXiv

Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models

Oct 11, 2023
Via arXiv

Model-enhanced Vector Index

Sep 23, 2023
Via arXiv