Zhaode Wang

AccKV: Towards Efficient Audio-Video LLMs Inference via Adaptive-Focusing and Cross-Calibration KV Cache Optimization

Nov 14, 2025
Via arXiv

MNN-LLM: A Generic Inference Engine for Fast Large Language Model Deployment on Mobile Devices

Jun 12, 2025
Via arXiv

Walle: An End-to-End, General-Purpose, and Large-Scale Production System for Device-Cloud Collaborative Machine Learning

May 30, 2022
Via arXiv