Christos Kozyrakis

Improving Efficiency of GPU Kernel Optimization Agents using a Domain-Specific Language and Speed-of-Light Guidance

Mar 30, 2026

SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits

Mar 19, 2026

AI+HW 2035: Shaping the Next Decade

Mar 05, 2026

KernelBlaster: Continual Cross-Task CUDA Optimization via Memory-Augmented In-Context Reinforcement Learning

Feb 15, 2026

Accelerating Mixture-of-Experts Training with Adaptive Expert Replication

Apr 28, 2025

Efficient GNN Training Through Structure-Aware Randomized Mini-Batching

Apr 25, 2025

AI Metropolis: Scaling Large Language Model-based Multi-Agent Simulation with Out-of-order Execution

Nov 05, 2024

Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight

Jul 11, 2024

SlipStream: Adapting Pipelines for Distributed Training of Large DNNs Amid Failures

May 22, 2024

cedar: Composable and Optimized Machine Learning Input Data Pipelines

Jan 25, 2024