Time series analysis comprises statistical methods for analyzing a sequence of data points collected over an interval of time to identify interesting patterns and trends.
Forecasting technological advancement in complex domains such as space exploration presents significant challenges due to the intricate interaction of technical, economic, and policy-related factors. The field of technology forecasting has long relied on quantitative trend extrapolation techniques, such as growth curves (e.g., Moore's law) and time series models, to project technological progress. To assess the current state of these methods, we conducted an updated systematic literature review (SLR) that incorporates recent advances. This review highlights a growing trend toward machine learning-based hybrid models. Motivated by this review, we developed a forecasting model that combines long short-term memory (LSTM) neural networks with an augmentation of Moore's law to predict spacecraft lifetimes. Operational lifetime is an important engineering characteristic of spacecraft and a potential proxy for technological progress in space exploration. Lifetimes were modeled as depending on launch date and additional predictors. Our modeling analysis introduces a novel advance in the recently introduced Start Time End Time Integration (STETI) approach. STETI addresses a critical right censoring problem known to bias lifetime analyses: the more recent the launch dates, the shorter the lifetimes of the spacecraft that have failed and can thus contribute lifetime data. Longer-lived spacecraft are still operating and therefore do not contribute data. This systematically distorts putative lifetime versus launch date curves by biasing lifetime estimates for recent launch dates downward. STETI mitigates this distortion by interconverting between expressing lifetimes as functions of launch time and modeling them as functions of failure time. The results provide insights relevant to space mission planning and policy decision-making.
With the growing popularity of electric vehicles as a means of addressing climate change, concerns have emerged regarding their impact on electric grid management. As a result, predicting EV charging demand has become a timely and important research problem. While substantial research has addressed energy load forecasting in transportation, relatively few studies systematically compare multiple forecasting methods across different temporal horizons and spatial aggregation levels in diverse urban settings. This work investigates the effectiveness of five time series forecasting models, ranging from traditional statistical approaches to machine learning and deep learning methods. Forecasting performance is evaluated for short-, mid-, and long-term horizons (on the order of minutes, hours, and days, respectively), and across spatial scales ranging from individual charging stations to regional and city-level aggregations. The analysis is conducted on four publicly available real-world datasets, with results reported independently for each dataset. To the best of our knowledge, this is the first work to systematically evaluate EV charging demand forecasting across such a wide range of temporal horizons and spatial aggregation levels using multiple real-world datasets.




Diffusion models have shown promise in forecasting future data from multivariate time series. However, few existing methods account for recurring structures, or patterns, that appear within the data. We present Pattern-Guided Diffusion Models (PGDM), which leverage inherent patterns within temporal data for forecasting future time steps. PGDM first extracts patterns using archetypal analysis and estimates the most likely next pattern in the sequence. By guiding predictions with this pattern estimate, PGDM makes more realistic predictions that fit within the set of known patterns. We additionally introduce a novel uncertainty quantification technique based on archetypal analysis, and we dynamically scale the guidance level based on the pattern estimate uncertainty. We apply our method to two well-motivated forecasting applications, predicting visual field measurements and motion capture frames. On both, we show that pattern guidance improves PGDM's performance (MAE / CRPS) by up to 40.67% / 56.26% and 14.12% / 14.10%, respectively. PGDM also outperforms baselines by up to 65.58% / 84.83% and 93.64% / 92.55%.
Accurate uncertainty quantification is a critical challenge in machine learning. While neural networks are highly versatile and capable of learning complex patterns, they often lack interpretability due to their ``black box'' nature. On the other hand, probabilistic ``white box'' models, though interpretable, often suffer from a significant performance gap when compared to neural networks. To address this, we propose a novel quantum physics-based ``white box'' method that offers both accurate uncertainty quantification and enhanced interpretability. By mapping the kernel mean embedding (KME) of a time series data vector to a reproducing kernel Hilbert space (RKHS), we construct a tensor network-inspired 1D spin chain Hamiltonian, with the KME as one of its eigen-functions or eigen-modes. We then solve the associated Schr{ö}dinger equation and apply perturbation theory to quantify uncertainty, thereby improving the interpretability of tasks performed with the quantum tensor network-based model. We demonstrate the effectiveness of this methodology, compared to state-of-the-art ``white box" models, in change point detection and time series clustering, providing insights into the uncertainties associated with decision-making throughout the process.
In the time-series domain, an increasing number of works combine text with temporal data to leverage the reasoning capabilities of large language models (LLMs) for various downstream time-series understanding tasks. This enables a single model to flexibly perform tasks that previously required specialized models for each domain. However, these methods typically rely on text labels for supervision during training, biasing the model toward textual cues while potentially neglecting the full temporal features. Such a bias can lead to outputs that contradict the underlying time-series context. To address this issue, we construct the EvalTS benchmark, comprising 10 tasks across three difficulty levels, from fundamental temporal pattern recognition to complex real-world reasoning, to evaluate models under more challenging and realistic scenarios. We also propose TimeSense, a multimodal framework that makes LLMs proficient in time-series analysis by balancing textual reasoning with a preserved temporal sense. TimeSense incorporates a Temporal Sense module that reconstructs the input time-series within the model's context, ensuring that textual reasoning is grounded in the time-series dynamics. Moreover, to enhance spatial understanding of time-series data, we explicitly incorporate coordinate-based positional embeddings, which provide each time point with spatial context and enable the model to capture structural dependencies more effectively. Experimental results demonstrate that TimeSense achieves state-of-the-art performance across multiple tasks, and it particularly outperforms existing methods on complex multi-dimensional time-series reasoning tasks.
This paper does not introduce a novel method but instead establishes a straightforward, incremental, yet essential baseline for video temporal grounding (VTG), a core capability in video understanding. While multimodal large language models (MLLMs) excel at various video understanding tasks, the recipes for optimizing them for VTG remain under-explored. In this paper, we present TimeLens, a systematic investigation into building MLLMs with strong VTG ability, along two primary dimensions: data quality and algorithmic design. We first expose critical quality issues in existing VTG benchmarks and introduce TimeLens-Bench, comprising meticulously re-annotated versions of three popular benchmarks with strict quality criteria. Our analysis reveals dramatic model re-rankings compared to legacy benchmarks, confirming the unreliability of prior evaluation standards. We also address noisy training data through an automated re-annotation pipeline, yielding TimeLens-100K, a large-scale, high-quality training dataset. Building on our data foundation, we conduct in-depth explorations of algorithmic design principles, yielding a series of meaningful insights and effective yet efficient practices. These include interleaved textual encoding for time representation, a thinking-free reinforcement learning with verifiable rewards (RLVR) approach as the training paradigm, and carefully designed recipes for RLVR training. These efforts culminate in TimeLens models, a family of MLLMs with state-of-the-art VTG performance among open-source models and even surpass proprietary models such as GPT-5 and Gemini-2.5-Flash. All codes, data, and models will be released to facilitate future research.




This paper presents a unified framework, for the detection, classification, and preliminary localization of anomalies in water distribution networks using multivariate statistical analysis. The approach, termed SICAMS (Statistical Identification and Classification of Anomalies in Mahalanobis Space), processes heterogeneous pressure and flow sensor data through a whitening transformation to eliminate spatial correlations among measurements. Based on the transformed data, the Hotelling's $T^2$ statistic is constructed, enabling the formulation of anomaly detection as a statistical hypothesis test of network conformity to normal operating conditions. It is shown that Hotelling's $T^2$ statistic can serve as an integral indicator of the overall "health" of the system, exhibiting correlation with total leakage volume, and thereby enabling approximate estimation of water losses via a regression model. A heuristic algorithm is developed to analyze the $T^2$ time series and classify detected anomalies into abrupt leaks, incipient leaks, and sensor malfunctions. Furthermore, a coarse leak localization method is proposed, which ranks sensors according to their statistical contribution and employs Laplacian interpolation to approximate the affected region within the network. Application of the proposed framework to the BattLeDIM L-Town benchmark dataset demonstrates high sensitivity and reliability in leak detection, maintaining robust performance even under multiple leaks. These capabilities make the method applicable to real-world operational environments without the need for a calibrated hydraulic model.
Power system time series analytics is critical in understanding the system operation conditions and predicting the future trends. Despite the wide adoption of Artificial Intelligence (AI) tools, many AI-based time series analytical models suffer from task-specificity (i.e. one model for one task) and structural rigidity (i.e. the input-output format is fixed), leading to limited model performances and resource wastes. In this paper, we propose a Causal-Guided Multimodal Large Language Model (CM-LLM) that can solve heterogeneous power system time-series analysis tasks. First, we introduce a physics-statistics combined causal discovery mechanism to capture the causal relationship, which is represented by graph, among power system variables. Second, we propose a multimodal data preprocessing framework that can encode and fuse text, graph and time series to enhance the model performance. Last, we formulate a generic "mask-and-reconstruct" paradigm and design a dynamic input-output padding mechanism to enable CM-LLM adaptive to heterogeneous time-series analysis tasks with varying sample lengths. Simulation results based on open-source LLM Qwen and real-world dataset demonstrate that, after simple fine-tuning, the proposed CM-LLM can achieve satisfying accuracy and efficiency on three heterogeneous time-series analytics tasks: missing data imputation, forecasting and super resolution.
Understanding and distinguishing temporal patterns in time series data is essential for scientific discovery and decision-making. For example, in biomedical research, uncovering meaningful patterns in physiological signals can improve diagnosis, risk assessment, and patient outcomes. However, existing methods for time series pattern discovery face major challenges, including high computational complexity, limited interpretability, and difficulty in capturing meaningful temporal structures. To address these gaps, we introduce a novel learning framework that jointly trains two Transformer models using complementary time series representations: shapelet-based representations to capture localized temporal structures and traditional feature engineering to encode statistical properties. The learned shapelets serve as interpretable signatures that differentiate time series across classification labels. Additionally, we develop a visual analytics system -- SigTIme -- with coordinated views to facilitate exploration of time series signatures from multiple perspectives, aiding in useful insights generation. We quantitatively evaluate our learning framework on eight publicly available datasets and one proprietary clinical dataset. Additionally, we demonstrate the effectiveness of our system through two usage scenarios along with the domain experts: one involving public ECG data and the other focused on preterm labor analysis.
Kernel-based methods such as Rocket are among the most effective default approaches for univariate time series classification (TSC), yet they do not perform equally well across all datasets. We revisit the long-standing intuition that different representations capture complementary structure and show that selectively fusing them can yield consistent improvements over Rocket on specific, systematically identifiable kinds of datasets. We introduce Fusion-3 (F3), a lightweight framework that adaptively fuses Rocket, Sax, and Sfa representations. To understand when fusion helps, we cluster UCR datasets into six groups using meta-features capturing series length, spectral structure, roughness, and class imbalance, and treat these clusters as interpretable data-structure regimes. Our analysis shows that fusion typically outperforms strong baselines in regimes with structured variability or rich frequency content, while offering diminishing returns in highly irregular or outlier-heavy settings. To support these findings, we combine three complementary analyses: non-parametric paired statistics across datasets, ablation studies isolating the roles of individual representations, and attribution via SHAP to identify which dataset properties predict fusion gains. Sample-level case studies further reveal the underlying mechanism: fusion primarily improves performance by rescuing specific errors, with adaptive increases in frequency-domain weighting precisely where corrections occur. Using 5-fold cross-validation on the 113 UCR datasets, F3 yields small but consistent average improvements over Rocket, supported by frequentist and Bayesian evidence and accompanied by clearly identifiable failure cases. Our results show that selectively applied fusion provides dependable and interpretable extension to strong kernel-based methods, correcting their weaknesses precisely where the data support it.