Title: PerfDojo: Automated ML Library Generation for Heterogeneous Architectures

Nov 05, 2025

Andrei Ivanov, Siyuan Shen, Gioele Gottardo, Marcin Chrapek, Afif Boudaoud, Timo Schneider, Luca Benini, Torsten Hoefler

Abstract: The increasing complexity of machine learning models and the proliferation of diverse hardware architectures (CPUs, GPUs, accelerators) make achieving optimal performance a significant challenge. Heterogeneity in instruction sets, specialized kernel requirements for different data types and model features (e.g., sparsity, quantization), and architecture-specific optimizations complicate performance tuning. Manual optimization is resource-intensive, while existing automatic approaches often rely on complex hardware-specific heuristics and uninterpretable intermediate representations, hindering performance portability. We introduce PerfLLM, a novel automatic optimization methodology leveraging Large Language Models (LLMs) and Reinforcement Learning (RL). Central to this is PerfDojo, an environment framing optimization as an RL game using a human-readable, mathematically inspired code representation that guarantees semantic validity through transformations. This allows effective optimization without prior hardware knowledge, facilitating both human analysis and RL agent training. We demonstrate PerfLLM's ability to achieve significant performance gains across diverse CPU (x86, Arm, RISC-V) and GPU architectures.
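
To make the "optimization as an RL game" framing concrete, here is a minimal toy sketch in Python. It is not the PerfDojo environment or its API: the names `run`, `actions`, `step`, `toy_cost`, and `greedy_episode` are hypothetical, the only transformation is interchanging adjacent loops of a small matrix multiplication, and the cost function is a made-up stand-in for a measured runtime. It only illustrates the general idea that the state is a program variant, every legal move preserves the computed result, and the reward is the cost reduction.

```python
import itertools

N = 4  # tiny problem size so the sketch runs instantly
A = [[i + j for j in range(N)] for i in range(N)]
B = [[i * j + 1 for j in range(N)] for i in range(N)]


def run(order):
    """Execute C = A @ B with the three loops nested in the given order."""
    C = [[0] * N for _ in range(N)]
    for idx in itertools.product(range(N), repeat=3):
        v = dict(zip(order, idx))  # map loop names to current indices
        C[v["i"]][v["j"]] += A[v["i"]][v["k"]] * B[v["k"]][v["j"]]
    return C


def actions(order):
    """Legal moves: interchange two adjacent loops (always valid for this kernel)."""
    return list(range(len(order) - 1))


def step(order, action):
    """Apply one move and return the new loop order (the new state)."""
    o = list(order)
    o[action], o[action + 1] = o[action + 1], o[action]
    return tuple(o)


def toy_cost(order):
    """Made-up cost model (assumption): cheaper when the innermost loop
    walks row-major data with unit stride. Not a real measurement."""
    return {"j": 1.0, "k": 2.0, "i": 4.0}[order[-1]]


def greedy_episode(order, max_steps=5):
    """Play one episode with a greedy agent; reward = cost reduction."""
    reference = run(("i", "j", "k"))
    total_reward = 0.0
    for _ in range(max_steps):
        best = min(actions(order), key=lambda a: toy_cost(step(order, a)))
        nxt = step(order, best)
        reward = toy_cost(order) - toy_cost(nxt)
        if reward <= 0:
            break  # no improving move is visible one step ahead
        assert run(nxt) == reference  # the move did not change the result
        order, total_reward = nxt, total_reward + reward
    return order, total_reward


if __name__ == "__main__":
    print(greedy_episode(("i", "j", "k")))
```

Because every move is a loop interchange, each state computes the same matrix product, so an agent can explore freely without a separate correctness check; the assertion is only there to demonstrate that property. The greedy agent is a placeholder for a learned policy and stops as soon as no single move lowers the toy cost.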

* The International Conference for High Performance Computing, Networking, Storage and Analysis (SC '25), November 16–21, 2025, St. Louis, MO, USA
