With the ongoing energy transition, demand-side flexibility has become an important aspect of the modern power grid for providing grid support and allowing further integration of sustainable energy sources. Besides traditional sources, the residential sector is another major and largely untapped source of flexibility, driven by the increased adoption of solar PV, home batteries, and EVs. However, unlocking this residential flexibility is challenging as it requires a control framework that can effectively manage household energy consumption, and maintain user comfort while being readily scalable across different, diverse houses. We aim to address this challenging problem and introduce a reinforcement learning-based approach using differentiable decision trees. This approach integrates the scalability of data-driven reinforcement learning with the explainability of (differentiable) decision trees. This leads to a controller that can be easily adapted across different houses and provides a simple control policy that can be explained to end-users, further improving user acceptance. As a proof-of-concept, we analyze our method using a home energy management problem, comparing its performance with commercially available rule-based baseline and standard neural network-based RL controllers. Through this preliminary study, we show that the performance of our proposed method is comparable to standard RL-based controllers, outperforming baseline controllers by ~20% in terms of daily cost savings while being straightforward to explain.
Demand-side flexibility is gaining importance as a crucial element in the energy transition process. Accounting for about 25% of final energy consumption globally, the residential sector is an important (potential) source of energy flexibility. However, unlocking this flexibility requires developing a control framework that (1) easily scales across different houses, (2) is easy to maintain, and (3) is simple to understand for end-users. A potential control framework for such a task is data-driven control, specifically model-free reinforcement learning (RL). Such RL-based controllers learn a good control policy by interacting with their environment, learning purely based on data and with minimal human intervention. Yet, they lack explainability, which hampers user acceptance. Moreover, limited hardware capabilities of residential assets forms a hurdle (e.g., using deep neural networks). To overcome both those challenges, we propose a novel method to obtain explainable RL policies by using differentiable decision trees. Using a policy distillation approach, we train these differentiable decision trees to mimic standard RL-based controllers, leading to a decision tree-based control policy that is data-driven and easy to explain. As a proof-of-concept, we examine the performance and explainability of our proposed approach in a battery-based home energy management system to reduce energy costs. For this use case, we show that our proposed approach can outperform baseline rule-based policies by about 20-25%, while providing simple, explainable control policies. We further compare these explainable policies with standard RL policies and examine the performance trade-offs associated with this increased explainability.
Multi-label classification problems with thousands of classes are hard to solve with in-context learning alone, as language models (LMs) might lack prior knowledge about the precise classes or how to assign them, and it is generally infeasible to demonstrate every class in a prompt. We propose a general program, $\texttt{Infer--Retrieve--Rank}$, that defines multi-step interactions between LMs and retrievers to efficiently tackle such problems. We implement this program using the $\texttt{DSPy}$ programming model, which specifies in-context systems in a declarative manner, and use $\texttt{DSPy}$ optimizers to tune it towards specific datasets by bootstrapping only tens of few-shot examples. Our primary extreme classification program, optimized separately for each task, attains state-of-the-art results across three benchmarks (HOUSE, TECH, TECHWOLF). We apply the same program to a benchmark with vastly different characteristics and attain competitive performance as well (BioDEX). Unlike prior work, our proposed solution requires no finetuning, is easily applicable to new tasks, alleviates prompt engineering, and requires only tens of labeled examples. Our code is public at https://github.com/KarelDO/xmc.dspy.
Growth in the penetration of renewable energy sources makes supply more uncertain and leads to an increase in the system imbalance. This trend, together with the single imbalance pricing, opens an opportunity for balance responsible parties (BRPs) to perform energy arbitrage in the imbalance settlement mechanism. To this end, we propose a battery control framework based on distributional reinforcement learning (DRL). Our proposed control framework takes a risk-sensitive perspective, allowing BRPs to adjust their risk preferences: we aim to optimize a weighted sum of the arbitrage profit and a risk measure while constraining the daily number of cycles for the battery. We assess the performance of our proposed control framework using the Belgian imbalance prices of 2022 and compare two state-of-the-art RL methods, deep Q learning and soft actor-critic. Results reveal that the distributional soft actor-critic method can outperform other methods. Moreover, we note that our fully risk-averse agent appropriately learns to hedge against the risk related to the unknown imbalance price by (dis)charging the battery only when the agent is more certain about the price.
Controlling energy consumption in buildings through demand response (DR) has become increasingly important to reduce global carbon emissions and limit climate change. In this paper, we specifically focus on controlling the heating system of a residential building to optimize its energy consumption while respecting user's thermal comfort. Recent works in this area have mainly focused on either model-based control, e.g., model predictive control (MPC), or model-free reinforcement learning (RL) to implement practical DR algorithms. A specific RL method that recently has achieved impressive success in domains such as board games (go, chess) is Monte Carlo Tree Search (MCTS). Yet, for building control it has remained largely unexplored. Thus, we study MCTS specifically for building demand response. Its natural structure allows a flexible optimization that implicitly integrate exogenous constraints (as opposed, for example, to conventional RL solutions), making MCTS a promising candidate for DR control problems. We demonstrate how to improve MCTS control performance by incorporating a Physics-informed Neural Network (PiNN) model for its underlying thermal state prediction, as opposed to traditional purely data-driven Black-Box approaches. Our MCTS implementation aligned with a PiNN model is able to obtain a 3% increment of the obtained reward compared to a rule-based controller; leading to a 10% cost reduction and 35% reduction on temperature difference with the desired one when applied to an artificial price profile. We further implemented a Deep Learning layer into the Monte Carlo Tree Search technique using a neural network that leads the tree search through more optimal nodes. We then compared this addition with its Vanilla version, showing the improvement in computational cost required.
Model interpretability and model editing are crucial goals in the age of large language models. Interestingly, there exists a link between these two goals: if a method is able to systematically edit model behavior with regard to a human concept of interest, this editor method can help make internal representations more interpretable by pointing towards relevant representations and systematically manipulating them.
The brittleness of finetuned language model performance on out-of-distribution (OOD) test samples in unseen domains has been well-studied for English, yet is unexplored for multi-lingual models. Therefore, we study generalization to OOD test data specifically in zero-shot cross-lingual transfer settings, analyzing performance impacts of both language and domain shifts between train and test data. We further assess the effectiveness of counterfactually augmented data (CAD) in improving OOD generalization for the cross-lingual setting, since CAD has been shown to benefit in a monolingual English setting. Finally, we propose two new approaches for OOD generalization that avoid the costly annotation process associated with CAD, by exploiting the power of recent large language models (LLMs). We experiment with 3 multilingual models, LaBSE, mBERT, and XLM-R trained on English IMDb movie reviews, and evaluate on OOD test sets in 13 languages: Amazon product reviews, Tweets, and Restaurant reviews. Results echo the OOD performance decline observed in the monolingual English setting. Further, (i) counterfactuals from the original high-resource language do improve OOD generalization in the low-resource language, and (ii) our newly proposed cost-effective approaches reach similar or up to +3.1% better accuracy than CAD for Amazon and Restaurant reviews.
In disentangling the heterogeneity observed in psychopathology, personality of the patients is considered crucial. While it has been demonstrated that personality traits are reflected in the language used by a patient, we hypothesize that this enables automatic inference of the personality type directly from speech utterances, potentially more accurately than through a traditional questionnaire-based approach explicitly designed for personality classification. To validate this hypothesis, we adopt natural language processing (NLP) and standard machine learning tools for classification. We test this on a dataset of recorded clinical diagnostic interviews (CDI) on a sample of 79 patients diagnosed with major depressive disorder (MDD) -- a condition for which differentiated treatment based on personality styles has been advocated -- and classified into anaclitic and introjective personality styles. We start by analyzing the interviews to see which linguistic features are associated with each style, in order to gain a better understanding of the styles. Then, we develop automatic classifiers based on (a) standardized questionnaire responses; (b) basic text features, i.e., TF-IDF scores of words and word sequences; (c) more advanced text features, using LIWC (linguistic inquiry and word count) and context-aware features using BERT (bidirectional encoder representations from transformers); (d) audio features. We find that automated classification with language-derived features (i.e., based on LIWC) significantly outperforms questionnaire-based classification models. Furthermore, the best performance is achieved by combining LIWC with the questionnaire features. This suggests that more work should be put into developing linguistically based automated techniques for characterizing personality, however questionnaires still to some extent complement such methods.
Increasingly, homeowners opt for photovoltaic (PV) systems and/or battery storage to minimize their energy bills and maximize renewable energy usage. This has spurred the development of advanced control algorithms that maximally achieve those goals. However, a common challenge faced while developing such controllers is the unavailability of accurate forecasts of household power consumption, especially for shorter time resolutions (15 minutes) and in a data-efficient manner. In this paper, we analyze how transfer learning can help by exploiting data from multiple households to improve a single house's load forecasting. Specifically, we train an advanced forecasting model (a temporal fusion transformer) using data from multiple different households, and then finetune this global model on a new household with limited data (i.e. only a few days). The obtained models are used for forecasting power consumption of the household for the next 24 hours~(day-ahead) at a time resolution of 15 minutes, with the intention of using these forecasts in advanced controllers such as Model Predictive Control. We show the benefit of this transfer learning setup versus solely using the individual new household's data, both in terms of (i) forecasting accuracy ($\sim$15\% MAE reduction) and (ii) control performance ($\sim$2\% energy cost reduction), using real-world household data.
Given its substantial contribution of 40\% to global power consumption, the built environment has received increasing attention to serve as a source of flexibility to assist the modern power grid. In that respect, previous research mainly focused on energy management of individual buildings. In contrast, in this paper, we focus on aggregated control of a set of residential buildings, to provide grid supporting services, that eventually should include ancillary services. In particular, we present a real-life pilot study that studies the effectiveness of reinforcement-learning (RL) in coordinating the power consumption of 8 residential buildings to jointly track a target power signal. Our RL approach relies solely on observed data from individual households and does not require any explicit building models or simulators, making it practical to implement and easy to scale. We show the feasibility of our proposed RL-based coordination strategy in a real-world setting. In a 4-week case study, we demonstrate a hierarchical control system, relying on an RL-based ranking system to select which households to activate flex assets from, and a real-time PI control-based power dispatch mechanism to control the selected assets. Our results demonstrate satisfactory power tracking, and the effectiveness of the RL-based ranks which are learnt in a purely data-driven manner.