Joan Oliveras

Data Driven Optimization of GPU efficiency for Distributed LLM Adapter Serving

Feb 27, 2026
Via arXiv

Maximizing GPU Efficiency via Optimal Adapter Caching: An Analytical Approach for Multi-Tenant LLM Serving

Aug 11, 2025
Via arXiv