Zhihang Yuan

PCDVQ: Enhancing Vector Quantization for Large Language Models via Polar Coordinate Decoupling

Jun 05, 2025
Via arXiv

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

May 27, 2025
Via arXiv

MxMoE: Mixed-precision Quantization for MoE with Accuracy and Performance Co-Design

May 09, 2025
Via arXiv

RWKVQuant: Quantizing the RWKV Family with Proxy Guided Hybrid of Scalar and Vector Quantization

May 02, 2025
Via arXiv

MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance

May 02, 2025
Via arXiv

VGDFR: Diffusion-based Video Generation with Dynamic Latent Frame Rate

Apr 16, 2025
Via arXiv

DiTFastAttnV2: Head-wise Attention Compression for Multi-Modality Diffusion Transformers

Mar 28, 2025
Via arXiv

GSQ-Tuning: Group-Shared Exponents Integer in Fully Quantized Training for LLMs On-Device Fine-tuning

Feb 18, 2025
Via arXiv

DLFR-VAE: Dynamic Latent Frame Rate VAE for Video Generation

Feb 17, 2025
Via arXiv

E-CAR: Efficient Continuous Autoregressive Image Generation via Multistage Modeling

Dec 19, 2024
Via arXiv