Zhenhua Dong

HAWK: Head Importance-Aware Visual Token Pruning in Multimodal Models

Apr 09, 2026
Via arXiv

BATQuant: Outlier-resilient MXFP4 Quantization via Learnable Block-wise Optimization

Mar 17, 2026
Via arXiv

FairFS: Addressing Deep Feature Selection Biases for Recommender System

Feb 23, 2026
Via arXiv

Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats

Feb 13, 2026
Via arXiv

MALLOC: Benchmarking the Memory-aware Long Sequence Compression for Large Sequential Recommendation

Jan 29, 2026
Via arXiv

Length-Adaptive Interest Network for Balancing Long and Short Sequence Modeling in CTR Prediction

Jan 27, 2026
Via arXiv

Benchmarking Post-Training Quantization of Large Language Models under Microscaling Floating Point Formats

Jan 14, 2026
Via arXiv

SwiftMem: Fast Agentic Memory via Query-aware Indexing

Jan 13, 2026
Via arXiv

What Matters For Safety Alignment?

Jan 07, 2026
Via arXiv

Towards Efficient Agents: A Co-Design of Inference Architecture and System

Dec 20, 2025
Via arXiv