Abstract: Training large-scale machine learning models incurs substantial carbon emissions. Federated Learning (FL), by distributing computation across geographically dispersed clients, offers a natural framework for exploiting regional and temporal variations in Carbon Intensity (CI). This paper investigates how to reduce emissions in FL through carbon-aware client selection and training scheduling. We first quantify the emission savings of a carbon-aware scheduling policy that leverages slack time, permitting a modest extension of the training duration so that clients can defer local training rounds to lower-carbon periods. We then examine the performance trade-offs of such scheduling, which stem from statistical heterogeneity among clients, selection bias in participation, and temporal correlation in model updates. To navigate these trade-offs, we construct a carbon-aware scheduler that integrates slack time, $\alpha$-fair carbon allocation, and a global fine-tuning phase. Experiments on real-world CI data show that our scheduler outperforms slack-agnostic baselines, achieving higher model accuracy across a wide range of carbon budgets, with especially strong gains under tight carbon constraints.
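To make the two scheduling ingredients the abstract names more concrete, here is a minimal Python sketch: deferring each client's local rounds to the lowest-CI slots within a slack-extended horizon, and dividing a carbon budget across clients with an $\alpha$-fair rule. The greedy per-client slot selection, the utility weights, and all function names are illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def schedule_with_slack(ci_forecast, rounds_needed, slack):
    """Pick, per client, the `rounds_needed` lowest-carbon time slots
    within a horizon extended by `slack` extra slots.

    ci_forecast: (num_clients, num_slots) carbon-intensity forecast
    in gCO2eq/kWh. Greedy per-client selection is an assumption.
    """
    horizon = rounds_needed + slack
    schedule = []
    for client_ci in ci_forecast[:, :horizon]:
        # Defer training to the greenest slots inside the slack window.
        slots = np.sort(np.argsort(client_ci)[:rounds_needed])
        schedule.append(slots)
    return schedule

def alpha_fair_allocation(utilities, carbon_budget, alpha=1.0):
    """Split a carbon budget across clients alpha-fairly.

    Maximizing sum_i u_i * x_i^(1-alpha) / (1-alpha) subject to
    sum_i x_i = budget gives x_i proportional to u_i^(1/alpha);
    alpha = 1 recovers proportional-fair allocation.
    """
    w = np.asarray(utilities, dtype=float) ** (1.0 / alpha)
    return carbon_budget * w / w.sum()

# Example: 3 clients, 6 hourly CI slots, 2 rounds each, 2 slots of slack.
ci = np.array([[300, 120, 250, 90, 400, 80],
               [200, 210, 100, 95, 300, 250],
               [150, 140, 130, 500, 450, 60]])
print(schedule_with_slack(ci, rounds_needed=2, slack=2))
print(alpha_fair_allocation([1.0, 2.0, 4.0], carbon_budget=10.0))
```

Increasing `slack` widens the window each client may search for green slots, which is exactly the emissions-versus-training-time trade-off the abstract quantifies.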
Abstract: Monitoring, understanding, and optimizing energy consumption are among the many reasons to evaluate the energy usage of Machine Learning (ML). However, no universal tool can answer this question for all use cases, and there may even be disagreement on how to evaluate energy consumption for a specific use case. Tools and methods are based on different approaches, each with its own advantages and drawbacks, and they need to be mapped out and explained in order to select the most suitable one for a given situation. We address this challenge through two approaches. First, we conduct a systematic literature review of all tools and methods that can evaluate the energy consumption of ML (both at training and at inference), irrespective of whether they were originally designed for machine learning or for general software. Second, we develop and use an experimental protocol to compare a selection of these tools and methods. The comparison is both qualitative and quantitative, covering ML tasks that differ in nature (vision, language) and computational complexity. The systematic literature review serves as a comprehensive guide for understanding the array of tools and methods used to evaluate the energy consumption of ML, for use cases ranging from basic energy monitoring to consumption optimization. Two open-source repositories are provided for further exploration. The first contains tools that can be used to replicate this work or extend the current review. The second houses the experimental protocol, allowing users to augment it with new ML computing tasks and additional energy evaluation tools.
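As a concrete example of the kind of tool such a review covers, the snippet below uses CodeCarbon, a widely used open-source software-based emissions estimator, to attach an emissions estimate to a training run. Whether CodeCarbon appears in this particular review is an assumption here, and `train_model` is a placeholder for any ML workload.

```python
from codecarbon import EmissionsTracker

def train_model():
    # Placeholder for any ML training or inference workload.
    pass

tracker = EmissionsTracker(project_name="energy-eval-demo")
tracker.start()
train_model()
emissions_kg = tracker.stop()  # estimated emissions in kg CO2eq
print(f"Estimated emissions: {emissions_kg:.6f} kg CO2eq")
```

Software-based estimators like this one infer energy from hardware counters and regional carbon-intensity data, which is precisely why the review's qualitative and quantitative comparison against other measurement approaches matters.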