Autoscaling


SAIR: Cost-Efficient Multi-Stage ML Pipeline Autoscaling via In-Context Reinforcement Learning

Jan 29, 2026
Via arXiv

WarmServe: Enabling One-for-Many GPU Prewarming for Multi-LLM Serving

Dec 10, 2025
Via arXiv

Multi-Dimensional Autoscaling of Stream Processing Services on Edge Devices

Oct 08, 2025
Via arXiv

Insights from Gradient Dynamics: Gradient Autoscaled Normalization

Sep 03, 2025
Via arXiv

Taming the Chaos: Coordinated Autoscaling for Heterogeneous and Disaggregated LLM Inference

Aug 27, 2025
Via arXiv

Multi-dimensional Autoscaling of Processing Services: A Comparison of Agent-based Methods

Jun 12, 2025
Via arXiv

Streamlining Resilient Kubernetes Autoscaling with Multi-Agent Systems via an Automated Online Design Framework

May 26, 2025
Via arXiv

Learning in Chaos: Efficient Autoscaling and Self-healing for Distributed Training at the Edge

May 19, 2025
Via arXiv

Scalability Optimization in Cloud-Based AI Inference Services: Strategies for Real-Time Load Balancing and Automated Scaling

Apr 16, 2025
Via arXiv

OVERLORD: Ultimate Scaling of DataLoader for Multi-Source Large Foundation Model Training

Apr 14, 2025
Via arXiv