Florian Ziel

Fast Training of Mixture-of-Experts for Time Series Forecasting via Expert Loss Integration

May 11, 2026

Btissame El Mahtout, Florian Ziel

Abstract:We propose a novel adaptive Mixture-of-Experts (MoE) framework for time series forecasting that enhances expert specialization by incorporating expert-specific loss information directly into the training process. Notably, the overall objective comprises the base forecasting loss and expert-specific losses, allowing expert-level prediction errors to jointly shape training alongside the global forecasting loss. This framework is further combined with a partial online learning strategy, enabling incremental updates of both the gating mechanism and expert parameters. This approach significantly reduces computational cost by eliminating the need for repeated full model retraining. By integrating expert-level loss awareness with efficient online optimization, the proposed method achieves improved learning efficiency while maintaining strong predictive performance. Empirical results across economic, tourism, and energy datasets with varying frequencies demonstrate that the proposed approach generally outperforms both statistical methods and state-of-the-art neural network models, such as Transformers and WaveNet, in forecasting accuracy and computational efficiency. Furthermore, ablation studies confirm the effectiveness of the expert-specific loss integration strategy, highlighting its contribution to enhancing predictive performance.
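The abstract describes an objective that adds expert-specific losses to the base forecasting loss. The sketch below shows one way such an objective could be assembled. The squared-error losses, the softmax gate, the weighting factor `lam`, and the function names are all assumptions for illustration; the abstract does not specify the paper's exact formulation.

```python
import numpy as np

def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def moe_loss(gate_logits, expert_preds, y, lam=0.5):
    """Hypothetical MoE objective: base loss plus weighted expert losses.

    gate_logits, expert_preds: arrays of shape (n, K); y: shape (n,).
    Returns (total, base, per_expert), where per_expert has shape (K,).
    """
    w = softmax(gate_logits, axis=1)              # gating weights per sample
    combined = (w * expert_preds).sum(axis=1)     # mixture forecast
    base = np.mean((combined - y) ** 2)           # global forecasting loss
    per_expert = np.mean((expert_preds - y[:, None]) ** 2, axis=0)
    total = base + lam * per_expert.sum()         # assumed additive combination
    return total, base, per_expert

# Two experts whose errors cancel in the equally weighted mixture.
logits = np.zeros((2, 2))
preds = np.array([[1.0, 3.0], [2.0, 4.0]])
target = np.array([2.0, 3.0])
total, base, per_expert = moe_loss(logits, preds, target)
```

In this toy case the mixture forecast is exact, so the base loss is zero, yet each expert still carries an error of its own. The extra term is what lets expert-level errors influence training in a setup like this.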

Energy-Arena: A Dynamic Benchmark for Operational Energy Forecasting

Apr 27, 2026

Max Kleinebrahm, Jonathan Berrisch, Philipp Eiser, Wolf Fichtner, Veit Hagenmeyer, Matthias Hertel, Nils Koster, Sebastian Lerch, Ralf Mikut, Jan Priesmann(+4 more)

Abstract:Energy forecasting research faces a persistent comparability gap that makes it difficult to measure consistent progress over time. Reported accuracy gains are often not directly comparable because models are evaluated under study-specific datasets, time periods, information sets, and scoring setups, while widely used benchmarks and competition datasets are typically tied to fixed historical windows. This paper introduces the Energy-Arena, a dynamic benchmarking platform for operational energy time series forecasting that provides a continuously updated reference point as energy systems evolve. The platform operates as an open, API-based submission system and standardizes challenge definitions and submission deadlines aligned with operational constraints. Performance is reported on rolling evaluation windows via persistent leaderboards. By moving from retrospective backtesting to forward-looking benchmarking, the Energy-Arena enforces standardized ex-ante submission and ex-post evaluation, thereby improving transparency by preventing information leakage and retroactive tuning. The platform is publicly available at Energy-Arena.org.

* 6 pages, 5 figures, 1 table. Submitted to the European Electricity Markets (EEM) conference

Electricity Price Forecasting: Bridging Linear Models, Neural Networks and Online Learning

Jan 06, 2026

Btissame El Mahtout, Florian Ziel

Abstract:Precise day-ahead forecasts for electricity prices are crucial to ensure efficient portfolio management, support strategic decision-making for power plant operations, enable efficient battery storage optimization, and facilitate demand response planning. However, developing an accurate prediction model is highly challenging in an uncertain and volatile market environment. For instance, although linear models generally exhibit competitive performance in predicting electricity prices with minimal computational requirements, they fail to capture relevant nonlinear relationships. Nonlinear models, on the other hand, can improve forecasting accuracy with a surge in computational costs. We propose a novel multivariate neural network approach that combines linear and nonlinear feed-forward neural structures. Unlike previous hybrid models, our approach integrates online learning and forecast combination for efficient training and accuracy improvement. It also incorporates all relevant characteristics, particularly the fundamental relationships arising from wind and solar generation, electricity demand patterns, related energy fuel and carbon markets, in addition to autoregressive dynamics and calendar effects. Compared to the current state-of-the-art benchmark models, the proposed forecasting method significantly reduces computational cost while delivering superior forecasting accuracy (12-13% RMSE and 15-18% MAE reductions). Our results are derived from a six-year forecasting study conducted on major European electricity markets.

Efficient mid-term forecasting of hourly electricity load using generalized additive models

May 27, 2024

Monika Zimmermann, Florian Ziel

Abstract:Accurate mid-term (weeks to one year) hourly electricity load forecasts are essential for strategic decision-making in power plant operation, ensuring supply security and grid stability, and energy trading. While numerous models effectively predict short-term (hours to a few days) hourly load, mid-term forecasting solutions remain scarce. In mid-term load forecasting, besides daily, weekly, and annual seasonal and autoregressive effects, capturing weather and holiday effects, as well as socio-economic non-stationarities in the data, poses significant modeling challenges. To address these challenges, we propose a novel forecasting method using Generalized Additive Models (GAMs) built from interpretable P-splines and enhanced with autoregressive post-processing. This model uses smoothed temperatures, Error-Trend-Seasonal (ETS) modeled non-stationary states, a nuanced representation of holiday effects with weekday variations, and seasonal information as input. The proposed model is evaluated on load data from 24 European countries. This analysis demonstrates that the model not only has significantly enhanced forecasting accuracy compared to state-of-the-art methods but also offers valuable insights into the influence of individual components on predicted load, given its full interpretability. Achieving performance akin to day-ahead TSO forecasts in fast computation times of a few seconds for several years of hourly data underscores the model's potential for practical application in the power system industry.

Multivariate Probabilistic CRPS Learning with an Application to Day-Ahead Electricity Prices

Mar 17, 2023

Jonathan Berrisch, Florian Ziel

Abstract:This paper presents a new method for combining (or aggregating or ensembling) multivariate probabilistic forecasts, taking into account dependencies between quantiles and covariates through a smoothing procedure that allows for online learning. Two smoothing methods are discussed: dimensionality reduction using basis matrices and penalized smoothing. The new online learning algorithm generalizes the standard CRPS learning framework to the multivariate setting. It is based on Bernstein Online Aggregation (BOA) and yields optimal asymptotic learning properties. We provide an in-depth discussion of possible extensions of the algorithm and several nested cases related to the existing literature on online forecast combination. The methodology is applied to forecasting day-ahead electricity prices, which are 24-dimensional distributional forecasts. The proposed method yields significant improvements over uniform combination in terms of continuous ranked probability score (CRPS). We discuss the temporal evolution of the weights and hyperparameters and present the results of reduced versions of the preferred model. A fast C++ implementation of all discussed methods is provided in the R package profoc.
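To illustrate the idea of learning combination weights online, separately for each quantile, here is a minimal sketch. It substitutes a plain exponentially weighted average update for the BOA algorithm used in the paper and omits the smoothing step. The function names, the learning rate `eta`, and the array layout are assumptions. The pinball loss is used because the CRPS equals twice the integral of the pinball loss over all quantile levels.

```python
import numpy as np

def pinball(q_pred, y, tau):
    # Quantile (pinball) loss of forecast q_pred at level tau.
    diff = y - q_pred
    return np.maximum(tau * diff, (tau - 1.0) * diff)

def ewa_quantile_weights(expert_q, y, taus, eta=1.0):
    """Exponentially weighted combination weights per quantile level.

    expert_q: (T, K, P) quantile forecasts of K experts at P levels,
    y: (T,) observations, taus: (P,) quantile levels.
    Returns weights of shape (T + 1, K, P); row t is usable at time t.
    """
    T, K, P = expert_q.shape
    w = np.full((T + 1, K, P), 1.0 / K)           # start from uniform weights
    cum = np.zeros((K, P))                        # cumulative loss per expert
    for t in range(T):
        cum += pinball(expert_q[t], y[t], taus[None, :])
        z = -eta * (cum - cum.min(axis=0, keepdims=True))  # shift for stability
        e = np.exp(z)
        w[t + 1] = e / e.sum(axis=0, keepdims=True)
    return w

# Expert 0 is always exact, expert 1 is biased upward by 1.
obs = np.arange(5, dtype=float)
levels = np.array([0.5])
forecasts = np.stack([obs, obs + 1.0], axis=1)[:, :, None]
weights = ewa_quantile_weights(forecasts, obs, levels)
```

The weights start uniform and move toward the exact expert as its cumulative pinball loss stays at zero while the biased expert accumulates 0.5 per step.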

Simulation-based Forecasting for Intraday Power Markets: Modelling Fundamental Drivers for Location, Shape and Scale of the Price Distribution

Nov 23, 2022

Simon Hirsch, Florian Ziel

Abstract:In recent years, European intraday power markets have gained importance for balancing forecast errors due to the rising volumes of intermittent renewable generation. However, compared to day-ahead markets, the drivers of the intraday price process are still sparsely researched. In this paper, we propose a modelling strategy for the location, shape and scale parameters of the return distribution in intraday markets, based on fundamental variables. As explanatory variables, we consider wind and solar forecasts and their intraday updates, outages, price information and a novel measure for the shape of the merit-order, derived from spot auction curves. We validate our modelling by simulating price paths and compare the probabilistic forecasting performance of our model to benchmark models in a forecasting study for the German market. The approach yields significant improvements in the forecasting performance, especially in the tails of the distribution. At the same time, we are able to derive the contribution of the driving variables. We find that, apart from the first lag of the price changes, none of our fundamental variables have explanatory power for the expected value of the intraday returns. This implies weak-form market efficiency, as renewable forecast changes and outage information seem to be priced in by the market. We find that the volatility is driven by the merit-order regime, the time to delivery and the closure of cross-border order books. The tail of the distribution is mainly influenced by past price differences and trading activity. Our approach is directly transferable to other continuous intraday markets in Europe.

Modeling Volatility and Dependence of European Carbon and Energy Prices

Aug 30, 2022

Jonathan Berrisch, Sven Pappert, Florian Ziel, Antonia Arsova

Abstract:We study the prices of European Emission Allowances (EUA), analyzing their uncertainty and dependencies on related energy markets. We propose a probabilistic multivariate conditional time series model that exploits key characteristics of the data. The forecasting performance of the proposed model and various competing models is evaluated in an extensive rolling window forecasting study, covering almost two years out-of-sample, in which we forecast 30 steps ahead. The accuracy of the multivariate probabilistic forecasts is assessed by the energy score. We discuss our findings focusing on volatility spillovers and time-varying correlations, also in view of the Russian invasion of Ukraine.
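The energy score mentioned above can be estimated from simulated forecast paths. The sketch below uses the standard sample-based estimator, ES = mean ||X_i - y|| - 0.5 * mean ||X_i - X_j||, with Euclidean norms. The function name and array layout are assumptions, and nothing here reflects the paper's own implementation.

```python
import numpy as np

def energy_score(samples, y):
    """Sample-based energy score (lower is better).

    samples: (m, d) draws from the multivariate forecast distribution,
    y: (d,) realized observation.
    """
    # Average distance between forecast draws and the observation.
    term1 = np.linalg.norm(samples - y, axis=1).mean()
    # Average pairwise distance between draws (all m*m pairs).
    diffs = samples[:, None, :] - samples[None, :, :]
    term2 = np.linalg.norm(diffs, axis=2).mean()
    return term1 - 0.5 * term2

# Two draws placed symmetrically around the observation.
draws = np.array([[0.0, 0.0], [2.0, 0.0]])
realized = np.array([1.0, 0.0])
es = energy_score(draws, realized)
```

For a one-dimensional forecast this estimator reduces to the sample version of the CRPS, which is why the energy score is often described as its multivariate generalization.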

Distributional neural networks for electricity price forecasting

Jul 06, 2022

Grzegorz Marcjasz, Michał Narajewski, Rafał Weron, Florian Ziel

Abstract:We present a novel approach to probabilistic electricity price forecasting (EPF) which utilizes distributional artificial neural networks. The novel network structure for EPF is based on a regularized distributional multilayer perceptron (DMLP) which contains a probability layer. Using the TensorFlow Probability framework, the neural network's output is defined to be a distribution, either normal or potentially skewed and heavy-tailed Johnson's SU (JSU). The method is compared against state-of-the-art benchmarks in a forecasting study. The study comprises forecasting involving day-ahead electricity prices in the German market. The results show evidence of the importance of higher moments when modeling electricity prices.

High-Resolution Peak Demand Estimation Using Generalized Additive Models and Deep Neural Networks

Mar 07, 2022

Jonathan Berrisch, Michał Narajewski, Florian Ziel

Abstract:This paper presents a method for estimating high-resolution electricity peak demand given lower resolution data. The technique won a data competition organized by the British distribution network operator Western Power Distribution. The exercise was to estimate the minimum and maximum load values in a single substation in a one-minute resolution as precisely as possible. In contrast, the data was given in half-hourly and hourly resolutions. The winning method combines generalized additive models (GAM) and deep artificial neural networks (DNN) which are popular in load forecasting. We provide an extensive analysis of the prediction models, including the importance of input parameters with a focus on load, weather, and seasonal effects. In addition, we provide a rigorous evaluation study that goes beyond the competition frame to analyze the robustness. The results show that the proposed methods are superior, not only in the single competition month but also in the meaningful evaluation study.

M5 Competition Uncertainty: Overdispersion, distributional forecasting, GAMLSS and beyond

Jul 14, 2021

Florian Ziel

Abstract:The M5 competition uncertainty track aims for probabilistic forecasting of sales of thousands of Walmart retail goods. We show that the M5 competition data exhibits strong overdispersion and sporadic demand, especially zero demand. We discuss resulting modeling issues concerning adequate probabilistic forecasting of such count data processes. Unfortunately, the majority of popular prediction methods used in the M5 competition (e.g. lightgbm and xgboost GBMs) fail to address the data characteristics due to the considered objective functions. Distributional forecasting provides a suitable modeling approach to overcome those problems. The GAMLSS framework allows flexible probabilistic forecasting using low-dimensional distributions. We illustrate how the GAMLSS approach can be applied to the M5 competition data by modeling the location and scale parameters of various distributions, e.g. the negative binomial distribution. Finally, we discuss software packages for distributional modeling and their drawbacks, like the R package gamlss with its package extensions, and (deep) distributional forecasting libraries such as TensorFlow Probability.
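As a small illustration of overdispersion in count data, the sketch below computes the variance-to-mean ratio and a method-of-moments fit of a negative binomial distribution with mean `mu` and variance `mu + mu**2 / size`. This is a rough diagnostic with hypothetical function names and toy data. It is not the GAMLSS estimation procedure discussed in the paper, which models the parameters as functions of covariates.

```python
import numpy as np

def dispersion_index(counts):
    # Variance-to-mean ratio; values well above 1 indicate overdispersion
    # relative to a Poisson distribution.
    counts = np.asarray(counts, dtype=float)
    return counts.var(ddof=1) / counts.mean()

def nb_moments_fit(counts):
    """Method-of-moments fit of a negative binomial (mu, size).

    Uses Var = mu + mu**2 / size. Returns size = inf when the sample
    shows no overdispersion (the Poisson limit).
    """
    counts = np.asarray(counts, dtype=float)
    mu = counts.mean()
    var = counts.var(ddof=1)
    if var <= mu:
        return mu, np.inf
    return mu, mu ** 2 / (var - mu)

# Sporadic demand: mostly zeros with an occasional large sale.
sales = np.array([0, 0, 0, 0, 10])
ratio = dispersion_index(sales)
mu, size = nb_moments_fit(sales)
```

A small `size` value corresponds to heavy overdispersion, the regime in which a Poisson-type objective would understate the spread of demand.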
