Kunxiong Zhu

FlashMem: Supporting Modern DNN Workloads on Mobile with GPU Memory Hierarchy Optimizations

Feb 17, 2026
Via arXiv

From Bits to Chips: An LLM-based Hardware-Aware Quantization Agent for Streamlined Deployment of LLMs

Jan 07, 2026
Via arXiv