Database knob tuning is a critical challenge in the database community, aiming to optimize knob values to enhance database performance for specific workloads. DBMS often feature hundreds of tunable knobs, posing a significant challenge for DBAs to recommend optimal configurations. Consequently, many machine learning-based tuning methods have been developed to automate this process. Despite the introduction of various optimizers, practical applications have unveiled a new problem: they typically require numerous workload runs to achieve satisfactory performance, a process that is both time-consuming and resource-intensive. This inefficiency largely stems from the optimal configuration often being substantially different from the default setting, necessitating multiple iterations during tuning. Recognizing this, we argue that an effective starting point could significantly reduce redundant exploration in less efficient areas, thereby potentially speeding up the tuning process for the optimizers. Based on this assumption, we introduce LLMTune, a large language model-based configuration generator designed to produce an initial, high-quality configuration for new workloads. These generated configurations can then serve as starting points for various base optimizers, accelerating their tuning processes. To obtain training data for LLMTune's supervised fine-tuning, we have devised a new automatic data generation framework capable of efficiently creating a large number of <workload, configuration> pairs. We have conducted thorough experiments to evaluate LLMTune's effectiveness with different workloads, such as TPC-H and JOB. In comparison to leading methods, LLMTune demonstrates a quicker ability to identify superior configurations. For instance, with the challenging TPC-H workload, our LLMTune achieves a significant 15.6x speed-up ratio in finding the best-performing configurations.
This paper introduces a Generative Adversarial Nets (GAN) based, Load Profile Inpainting Network (Load-PIN) for restoring missing load data segments and estimating the baseline for a demand response event. The inputs are time series load data before and after the inpainting period together with explanatory variables (e.g., weather data). We propose a Generator structure consisting of a coarse network and a fine-tuning network. The coarse network provides an initial estimation of the data segment in the inpainting period. The fine-tuning network consists of self-attention blocks and gated convolution layers for adjusting the initial estimations. Loss functions are specially designed for the fine-tuning and the discriminator networks to enhance both the point-to-point accuracy and realisticness of the results. We test the Load-PIN on three real-world data sets for two applications: patching missing data and deriving baselines of conservation voltage reduction (CVR) events. We benchmark the performance of Load-PIN with five existing deep-learning methods. Our simulation results show that, compared with the state-of-the-art methods, Load-PIN can handle varying-length missing data events and achieve 15-30% accuracy improvement.
This paper presents a deep-learning framework, Multi-load Generative Adversarial Network (MultiLoad-GAN), for generating a group of load profiles in one shot. The main contribution of MultiLoad-GAN is the capture of spatial-temporal correlations among a group of loads to enable the generation of realistic synthetic load profiles in large quantity for meeting the emerging need in distribution system planning. The novelty and uniqueness of the MultiLoad-GAN framework are three-fold. First, it generates a group of load profiles bearing realistic spatial-temporal correlations in one shot. Second, two complementary metrics for evaluating realisticness of generated load profiles are developed: statistics metrics based on domain knowledge and a deep-learning classifier for comparing high-level features. Third, to tackle data scarcity, a novel iterative data augmentation mechanism is developed to generate training samples for enhancing the training of both the classifier and the MultiLoad-GAN model. Simulation results show that MultiLoad-GAN outperforms state-of-the-art approaches in realisticness, computational efficiency, and robustness. With little finetuning, the MultiLoad-GAN approach can be readily extended to generate a group of load or PV profiles for a feeder, a substation, or a service area.
This paper proposes a two-stage PV forecasting framework for MW-level PV farms based on Temporal Convolutional Network (TCN). In the day-ahead stage, inverter-level physics-based model is built to convert Numerical Weather Prediction (NWP) to hourly power forecasts. TCN works as the NWP blender to merge different NWP sources to improve the forecasting accuracy. In the real-time stage, TCN can leverage the spatial-temporal correlations between the target site and its neighbors to achieve intra-hour power forecasts. A scenario-based correlation analysis method is proposed to automatically identify the most contributive neighbors. Simulation results based on 95 PV farms in North Carolina demonstrate the accuracy and efficiency of the proposed method.
It is a common practice for utilities to down-sample smart meter measurements from high resolution (e.g. 1-min or 1-sec) to low resolution (e.g. 15-, 30- or 60-min) to lower the data transmission and storage cost. However, down-sampling can remove high-frequency components from time-series load profiles, making them unsuitable for in-depth studies such as quasi-static power flow analysis or non-intrusive load monitoring (NILM). Thus, in this paper, we propose ProfileSR-GAN: a Generative Adversarial Network (GAN) based load profile super-resolution (LPSR) framework for restoring high-frequency components lost through the smoothing effect of the down-sampling process. The LPSR problem is formulated as a Maximum-a-Prior problem. When training the ProfileSR-GAN generator network, to make the generated profiles more realistic, we introduce two new shape-related losses in addition to conventionally used content loss: adversarial loss and feature-matching loss. Moreover, a new set of shape-based evaluation metrics are proposed to evaluate the realisticness of the generated profiles. Simulation results show that ProfileSR-GAN outperforms Mean-Square Loss based methods in all shape-based metrics. The successful application in NILM further demonstrates that ProfileSR-GAN is effective in recovering high-resolution realistic waveforms.
This paper presents a meta-learning based, automatic distribution system load forecasting model selection framework. The framework includes the following processes: feature extraction, candidate model labeling, offline training, and online model recommendation. Using user load forecasting needs as input features, multiple meta-learners are used to rank the available load forecast models based on their forecasting accuracy. Then, a scoring-voting mechanism weights recommendations from each meta-leaner to make the final recommendations. Heterogeneous load forecasting tasks with different temporal and technical requirements at different load aggregation levels are set up to train, validate, and test the performance of the proposed framework. Simulation results demonstrate that the performance of the meta-learning based approach is satisfactory in both seen and unseen forecasting tasks.
Q learning is widely used to simulate the behaviors of generation companies (GenCos) in an electricity market. However, existing Q learning method usually requires numerous iterations to converge, which is time-consuming and inefficient in practice. To enhance the calculation efficiency, a novel Q learning algorithm improved by dichotomy is proposed in this paper. This method modifies the update process of the Q table by dichotomizing the state space and the action space step by step. Simulation results in a repeated Cournot game show the effectiveness of the proposed algorithm.