Minyi Guo

Boosting Embodied AI Agents through Perception-Generation Disaggregation and Asynchronous Pipeline Execution

Sep 11, 2025
Via arXiv

ClusterFusion: Expanding Operator Fusion Scope for LLM Inference via Cluster-Level Collective Primitive

Aug 26, 2025
Via arXiv

Adacc: Adaptive Compression and Activation Checkpointing for LLM Memory Management

Aug 01, 2025
Via arXiv

Efficient Serving of LLM Applications with Probabilistic Demand Modeling

Jun 17, 2025
Via arXiv

Efficient Unified Caching for Accelerating Heterogeneous AI Workloads

Jun 14, 2025
Via arXiv

StreamingGS: Voxel-Based Streaming 3D Gaussian Splatting with Memory Optimization and Architectural Support

Jun 09, 2025
Via arXiv

Astraea: A GPU-Oriented Token-wise Acceleration Framework for Video Diffusion Transformers

Jun 06, 2025
Via arXiv

An Efficient Private GPT Never Autoregressively Decodes

May 21, 2025
Via arXiv

LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval

May 21, 2025
Via arXiv

Communication-Efficient Diffusion Denoising Parallelization via Reuse-then-Predict Mechanism

May 20, 2025
Via arXiv