Sai Qian Zhang

CodeQuant: Unified Clustering and Quantization for Enhanced Outlier Smoothing in Low-Precision Mixture-of-Experts

Apr 12, 2026

EgoEverything: A Benchmark for Human Behavior Inspired Long Context Egocentric Video Understanding in AR Environment

Apr 09, 2026

WSVD: Weighted Low-Rank Approximation for Fast and Efficient Execution of Low-Precision Vision-Language Models

Apr 02, 2026

pFedNavi: Structure-Aware Personalized Federated Vision-Language Navigation for Embodied AI

Feb 16, 2026

Rethinking the Outlier Distribution in Large Language Models: An In-depth Study

May 27, 2025

DREAM: Drafting with Refined Target Features and Entropy-Adaptive Cross-Attention Fusion for Multimodal Speculative Decoding

Add code
May 25, 2025
Viaarxiv icon

PipeSpec: Breaking Stage Dependencies in Hierarchical LLM Decoding

May 02, 2025

Foveated Instance Segmentation

Mar 27, 2025

Speculative Decoding and Beyond: An In-Depth Review of Techniques

Feb 27, 2025

Huff-LLM: End-to-End Lossless Compression for Efficient LLM Inference

Feb 02, 2025