Dawei Yang

TWLA: Achieving Ternary Weights and Low-Bit Activations for LLMs via Post-Training Quantization

Jun 11, 2026
Via arXiv

TORQ: Two-Level Orthogonal Rotation for MXFP4 Quantization

May 19, 2026
Via arXiv

R3-VAE: Reference Vector-Guided Rating Residual Quantization VAE for Generative Recommendation

Apr 14, 2026
Via arXiv

MoBiE: Efficient Inference of Mixture of Binary Experts under Post-Training Quantization

Apr 08, 2026
Via arXiv

SAES-SVD: Self-Adaptive Suppression of Accumulated and Local Errors for SVD-based LLM Compression

Feb 03, 2026
Via arXiv

NLI: Non-uniform Linear Interpolation Approximation of Nonlinear Operations for Efficient LLMs Inference

Feb 03, 2026
Via arXiv

OTARo: Once Tuning for All Precisions toward Robust On-Device LLMs

Nov 17, 2025
Via arXiv

FQ-PETR: Fully Quantized Position Embedding Transformation for Multi-View 3D Object Detection

Nov 14, 2025
Via arXiv

VAEVQ: Enhancing Discrete Visual Tokenization through Variational Modeling

Nov 10, 2025
Via arXiv

PCDVQ: Enhancing Vector Quantization for Large Language Models via Polar Coordinate Decoupling

Jun 05, 2025
Via arXiv