Time series analysis comprises statistical methods for analyzing a sequence of data points collected over an interval of time to identify interesting patterns and trends.
This paper introduces the MPS (Model Prediction Set), a novel framework for online model selection for nonstationary time series. Classical model selection methods, such as information criteria and cross-validation, rely heavily on the stationarity assumption and often fail in dynamic environments which undergo gradual or abrupt changes over time. Yet real-world data are rarely stationary, and model selection under nonstationarity remains a largely open problem. To tackle this challenge, we combine conformal inference with model confidence sets to develop a procedure that adaptively selects models best suited to the evolving dynamics at any given time. Concretely, the MPS updates in real time a confidence set of candidate models that covers the best model for the next time period with a specified long-run probability, while adapting to nonstationarity of unknown forms. Through simulations and real-world data analysis, we demonstrate that MPS reliably and efficiently identifies optimal models under nonstationarity, an essential capability lacking in offline methods. Moreover, MPS frequently produces high-quality sets with small cardinality, whose evolution offers deeper insights into changing dynamics. As a generic framework, MPS accommodates any data-generating process, data structure, model class, training method, and evaluation metric, making it broadly applicable across diverse problem settings.
Rapid expansion of model size has emerged as a key challenge in time series forecasting. From early Transformer with tens of megabytes to recent architectures like TimesNet with thousands of megabytes, performance gains have often come at the cost of exponentially increasing parameter counts. But is this scaling truly necessary? To question the applicability of the scaling law in time series forecasting, we propose Alinear, an ultra-lightweight forecasting model that achieves competitive performance using only k-level parameters. We introduce a horizon-aware adaptive decomposition mechanism that dynamically rebalances component emphasis across different forecast lengths, alongside a progressive frequency attenuation strategy that achieves stable prediction in various forecasting horizons without incurring the computational overhead of attention mechanisms. Extensive experiments on seven benchmark datasets demonstrate that Alinear consistently outperforms large-scale models while using less than 1% of their parameters, maintaining strong accuracy across both short and ultra-long forecasting horizons. Moreover, to more fairly evaluate model efficiency, we propose a new parameter-aware evaluation metric that highlights the superiority of ALinear under constrained model budgets. Our analysis reveals that the relative importance of trend and seasonal components varies depending on data characteristics rather than following a fixed pattern, validating the necessity of our adaptive design. This work challenges the prevailing belief that larger models are inherently better and suggests a paradigm shift toward more efficient time series modeling.
Multivariable time series forecasting methods can integrate information from exogenous variables, leading to significant prediction accuracy gains. Transformer architecture has been widely applied in various time series forecasting models due to its ability to capture long-range sequential dependencies. However, a na\"ive application of transformers often struggles to effectively model complex relationships among variables over time. To mitigate against this, we propose a novel architecture, namely the Spectral Operator Neural Network (Sonnet). Sonnet applies learnable wavelet transformations to the input and incorporates spectral analysis using the Koopman operator. Its predictive skill relies on the Multivariable Coherence Attention (MVCA), an operation that leverages spectral coherence to model variable dependencies. Our empirical analysis shows that Sonnet yields the best performance on $34$ out of $47$ forecasting tasks with an average mean absolute error (MAE) reduction of $1.1\%$ against the most competitive baseline (different per task). We further show that MVCA -- when put in place of the na\"ive attention used in various deep learning models -- can remedy its deficiencies, reducing MAE by $10.7\%$ on average in the most challenging forecasting tasks.
With Large language models (LLMs) becoming increasingly prevalent in various applications, the need for interpreting their predictions has become a critical challenge. As LLMs vary in architecture and some are closed-sourced, model-agnostic techniques show great promise without requiring access to the model's internal parameters. However, existing model-agnostic techniques need to invoke LLMs many times to gain sufficient samples for generating faithful explanations, which leads to high economic costs. In this paper, we show that it is practical to generate faithful explanations for large-scale LLMs by sampling from some budget-friendly models through a series of empirical studies. Moreover, we show that such proxy explanations also perform well on downstream tasks. Our analysis provides a new paradigm of model-agnostic explanation methods for LLMs, by including information from budget-friendly models.




The Earth's surface is subject to complex and dynamic processes, ranging from large-scale phenomena such as tectonic plate movements to localized changes associated with ecosystems, agriculture, or human activity. Satellite images enable global monitoring of these processes with extensive spatial and temporal coverage, offering advantages over in-situ methods. In particular, resulting satellite image time series (SITS) datasets contain valuable information. To handle their large volume and complexity, some recent works focus on the use of graph-based techniques that abandon the regular Euclidean structure of satellite data to work at an object level. Besides, graphs enable modelling spatial and temporal interactions between identified objects, which are crucial for pattern detection, classification and regression tasks. This paper is an effort to examine the integration of graph-based methods in spatio-temporal remote-sensing analysis. In particular, it aims to present a versatile graph-based pipeline to tackle SITS analysis. It focuses on the construction of spatio-temporal graphs from SITS and their application to downstream tasks. The paper includes a comprehensive review and two case studies, which highlight the potential of graph-based approaches for land cover mapping and water resource forecasting. It also discusses numerous perspectives to resolve current limitations and encourage future developments.
Deep learning models have significantly improved the ability to detect novelties in time series (TS) data. This success is attributed to their strong representation capabilities. However, due to the inherent variability in TS data, these models often struggle with generalization and robustness. To address this, a common approach is to perform Unsupervised Domain Adaptation, particularly Universal Domain Adaptation (UniDA), to handle domain shifts and emerging novel classes. While extensively studied in computer vision, UniDA remains underexplored for TS data. This work provides a comprehensive implementation and comparison of state-of-the-art TS backbones in a UniDA framework. We propose a reliable protocol to evaluate their robustness and generalization across different domains. The goal is to provide practitioners with a framework that can be easily extended to incorporate future advancements in UniDA and TS architectures. Our results highlight the critical influence of backbone selection in UniDA performance and enable a robustness analysis across various datasets and architectures.
In Computer Science (CS) education, understanding factors contributing to students' programming difficulties is crucial for effective learning support. By identifying specific issues students face, educators can provide targeted assistance to help them overcome obstacles and improve learning outcomes. While identifying sources of struggle, such as misconceptions, in real-time can be challenging in current educational practices, analyzing logical errors in students' code can offer valuable insights. This paper presents a scalable framework for automatically detecting logical errors in students' programming solutions. Our framework is based on an explainable Abstract Syntax Tree (AST) embedding model, the Subtree-based Attention Neural Network (SANN), that identifies the structural components of programs containing logical errors. We conducted a series of experiments to evaluate its effectiveness, and the results suggest that our framework can accurately capture students' logical errors and, more importantly, provide us with deeper insights into their learning processes, offering a valuable tool for enhancing programming education.
Accurate electricity load forecasting is essential for grid stability, resource optimization, and renewable energy integration. While transformer-based deep learning models like TimeGPT have gained traction in time-series forecasting, their effectiveness in long-term electricity load prediction remains uncertain. This study evaluates forecasting models ranging from classical regression techniques to advanced deep learning architectures using data from the ESD 2025 competition. The dataset includes two years of historical electricity load data, alongside temperature and global horizontal irradiance (GHI) across five sites, with a one-day-ahead forecasting horizon. Since actual test set load values remain undisclosed, leveraging predicted values would accumulate errors, making this a long-term forecasting challenge. We employ (i) Principal Component Analysis (PCA) for dimensionality reduction and (ii) frame the task as a regression problem, using temperature and GHI as covariates to predict load for each hour, (iii) ultimately stacking 24 models to generate yearly forecasts. Our results reveal that deep learning models, including TimeGPT, fail to consistently outperform simpler statistical and machine learning approaches due to the limited availability of training data and exogenous variables. In contrast, XGBoost, with minimal feature engineering, delivers the lowest error rates across all test cases while maintaining computational efficiency. This highlights the limitations of deep learning in long-term electricity forecasting and reinforces the importance of model selection based on dataset characteristics rather than complexity. Our study provides insights into practical forecasting applications and contributes to the ongoing discussion on the trade-offs between traditional and modern forecasting methods.
Multivariate time series (MTS) forecasting has a wide range of applications in both industry and academia. Recently, spatial-temporal graph neural networks (STGNNs) have gained popularity as MTS forecasting methods. However, current STGNNs can only use the finite length of MTS input data due to the computational complexity. Moreover, they lack the ability to identify similar patterns throughout the entire dataset and struggle with data that exhibit sparsely and discontinuously distributed correlations among variables over an extensive historical period, resulting in only marginal improvements. In this article, we introduce a simple yet effective k-nearest neighbor MTS forecasting ( kNN-MTS) framework, which forecasts with a nearest neighbor retrieval mechanism over a large datastore of cached series, using representations from the MTS model for similarity search. This approach requires no additional training and scales to give the MTS model direct access to the whole dataset at test time, resulting in a highly expressive model that consistently improves performance, and has the ability to extract sparse distributed but similar patterns spanning over multivariables from the entire dataset. Furthermore, a hybrid spatial-temporal encoder (HSTEncoder) is designed for kNN-MTS which can capture both long-term temporal and short-term spatial-temporal dependencies and is shown to provide accurate representation for kNN-MTSfor better forecasting. Experimental results on several real-world datasets show a significant improvement in the forecasting performance of kNN-MTS. The quantitative analysis also illustrates the interpretability and efficiency of kNN-MTS, showing better application prospects and opening up a new path for efficiently using the large dataset in MTS models.
Long-term time-series forecasting (LTSF) models are often presented as general-purpose solutions that can be applied across domains, implicitly assuming that all data is pointwise predictable. Using chaotic systems such as Lorenz-63 as a case study, we argue that geometric structure - not pointwise prediction - is the right abstraction for a dynamic-agnostic foundational model. Minimizing the Wasserstein-2 distance (W2), which captures geometric changes, and providing a spectral view of dynamics are essential for long-horizon forecasting. Our model, FRIREN (Flow-inspired Representations via Interpretable Eigen-networks), implements an augmented normalizing-flow block that embeds data into a normally distributed latent representation. It then generates a W2-efficient optimal path that can be decomposed into rotation, scaling, inverse rotation, and translation. This architecture yields locally generated, geometry-preserving predictions that are independent of the underlying dynamics, and a global spectral representation that functions as a finite Koopman operator with a small modification. This enables practitioners to identify which modes grow, decay, or oscillate, both locally and system-wide. FRIREN achieves an MSE of 11.4, MAE of 1.6, and SWD of 0.96 on Lorenz-63 in a 336-in, 336-out, dt=0.01 setting, surpassing TimeMixer (MSE 27.3, MAE 2.8, SWD 2.1). The model maintains effective prediction for 274 out of 336 steps, approximately 2.5 Lyapunov times. On Rossler (96-in, 336-out), FRIREN achieves an MSE of 0.0349, MAE of 0.0953, and SWD of 0.0170, outperforming TimeMixer's MSE of 4.3988, MAE of 0.886, and SWD of 3.2065. FRIREN is also competitive on standard LTSF datasets such as ETT and Weather. By connecting modern generative flows with classical spectral analysis, FRIREN makes long-term forecasting both accurate and interpretable, setting a new benchmark for LTSF model design.