Li Lyna Zhang

VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models

Sep 25, 2024, via arXiv

Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solvers

Aug 12, 2024, via arXiv

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

Apr 23, 2024, via arXiv

LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Feb 21, 2024, via arXiv

Boosting LLM Reasoning: Push the Limits of Few-shot Learning with Reinforced In-Context Pruning

Dec 26, 2023, via arXiv

Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models

Oct 11, 2023, via arXiv

Constraint-aware and Ranking-distilled Token Pruning for Efficient Transformer Inference

Jun 26, 2023, via arXiv

Accurate and Structured Pruning for Efficient Automatic Speech Recognition

May 31, 2023, via arXiv

ElasticViT: Conflict-aware Supernet Training for Deploying Fast Vision Transformer on Diverse Mobile Devices

Mar 21, 2023, via arXiv

SpaceEvo: Hardware-Friendly Search Space Design for Efficient INT8 Inference

Mar 15, 2023, via arXiv