Minyi Guo

Efficient Serving of LLM Applications with Probabilistic Demand Modeling

Jun 17, 2025
Via arXiv

Efficient Unified Caching for Accelerating Heterogeneous AI Workloads

Jun 14, 2025
Via arXiv

STREAMINGGS: Voxel-Based Streaming 3D Gaussian Splatting with Memory Optimization and Architectural Support

Jun 09, 2025
Via arXiv

Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers

Jun 06, 2025
Via arXiv

An Efficient Private GPT Never Autoregressively Decodes

May 21, 2025
Via arXiv

LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval

May 21, 2025
Via arXiv

Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism

May 20, 2025
Via arXiv

FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference

May 19, 2025
Via arXiv

Saga: Capturing Multi-granularity Semantics from Massive Unlabelled IMU Data for User Perception

Apr 16, 2025
Via arXiv

Prism: Mining Task-aware Domains in Non-i.i.d. IMU Data for Flexible User Perception

Jan 03, 2025
Via arXiv