
"Time Series Analysis": models, code, and papers

Forecast Evaluation for Data Scientists: Common Pitfalls and Best Practices

Apr 04, 2022
Hansika Hewamalage, Klaus Ackermann, Christoph Bergmeir

Machine Learning (ML) and Deep Learning (DL) methods are increasingly replacing traditional methods in many domains involving important decision-making activities. DL techniques tailor-made for specific tasks such as image recognition, signal processing, or speech analysis are being introduced at a fast pace with many improvements. However, for the domain of forecasting, the current state in the ML community is perhaps where other domains such as Natural Language Processing and Computer Vision were several years ago. The field of forecasting has mainly been fostered by statisticians and econometricians; consequently, the related concepts are not mainstream knowledge among general ML practitioners. The different non-stationarities associated with time series challenge data-driven ML models. Nevertheless, recent trends in the domain have shown that with the availability of massive amounts of time series, ML techniques are quite competent in forecasting, when the related pitfalls are properly handled. Therefore, in this work we provide a tutorial-like compilation of the details of one of the most important steps in the overall forecasting process, namely the evaluation. This way, we intend to impart the information of forecast evaluation to fit the context of ML, as a means of bridging the knowledge gap between traditional methods of forecasting and state-of-the-art ML techniques. We elaborate on the different problematic characteristics of time series, such as non-normality and non-stationarity, and how they are associated with common pitfalls in forecast evaluation. Best practices in forecast evaluation are outlined with respect to the different steps, such as data partitioning, error calculation, and statistical testing. Further guidelines are also provided on selecting valid and suitable error measures depending on the specific characteristics of the dataset at hand.
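Two of the practices the abstract mentions, rolling-origin data partitioning and scale-free error measures, can be sketched in a few lines. This is a minimal illustration, not the paper's code: the toy random-walk series, the naive forecaster, and the function names are assumptions chosen for the example; MASE (mean absolute scaled error) is one of the standard measures such guidelines tend to recommend.

```python
# Sketch: expanding-window (rolling-origin) evaluation with the MASE measure.
import numpy as np

def mase(y_true, y_pred, y_train, m=1):
    """Mean Absolute Scaled Error: forecast MAE divided by the in-sample
    MAE of the seasonal-naive forecast (season length m)."""
    scale = np.mean(np.abs(y_train[m:] - y_train[:-m]))
    return np.mean(np.abs(y_true - y_pred)) / scale

def rolling_origin_splits(n, initial, horizon=1):
    """Yield (train_indices, test_indices) pairs with an expanding window."""
    for end in range(initial, n - horizon + 1):
        yield np.arange(end), np.arange(end, end + horizon)

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=60))          # toy random-walk series

errors = []
for train_idx, test_idx in rolling_origin_splits(len(y), initial=40):
    forecast = np.repeat(y[train_idx][-1], len(test_idx))  # naive forecast
    errors.append(mase(y[test_idx], forecast, y[train_idx]))
avg_mase = float(np.mean(errors))
```

For a random walk, the naive forecast is optimal, so its MASE averages close to 1; values far above 1 would flag a misconfigured evaluation rather than a bad model.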

  

Joint learning of multiple Granger causal networks via non-convex regularizations: Inference of group-level brain connectivity

May 22, 2021
Parinthorn Manomaisaowapak, Jitkomut Songsiri

This paper considers joint learning of multiple sparse Granger graphical models to discover underlying common and differential Granger causality (GC) structures across multiple time series. This can be applied to drawing group-level brain connectivity inferences from a homogeneous group of subjects or discovering network differences among groups of signals collected under heterogeneous conditions. By recognizing that the GC of a single multivariate time series can be characterized by common zeros of vector autoregressive (VAR) lag coefficients, a group sparse prior is included in joint regularized least-squares estimations of multiple VAR models. Group-norm regularizations based on group- and fused-lasso penalties encourage a decomposition of multiple networks into a common GC structure, with other remaining parts defined in individual-specific networks. Prior information about sparseness and sparsity patterns of desired GC networks is incorporated as relative weights, while a non-convex group norm in the penalty is proposed to enhance the accuracy of network estimation in low-sample settings. Extensive numerical results on simulations illustrated our method's improvements over existing sparse estimation approaches on GC network sparsity recovery. Our methods were also applied to available resting-state fMRI time series from the ADHD-200 data sets to learn the differences of causality mechanisms, called effective brain connectivity, between adolescents with ADHD and typically developing children. Our analysis revealed that parts of the causality differences between the two groups often resided in the orbitofrontal region and areas associated with the limbic system, which agreed with clinical findings and data-driven results in previous studies.
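The structural fact the abstract builds on, that series j does not Granger-cause series i exactly when all VAR lag coefficients A[i, j] are zero, can be demonstrated with a plain least-squares fit. This is only a sketch of the principle: the toy two-series system, the threshold, and the ordinary least-squares estimator stand in for the paper's group-sparse, multi-model formulation.

```python
# Sketch: read a Granger-causality pattern off estimated VAR(1) coefficients.
import numpy as np

rng = np.random.default_rng(1)
T = 2000
A_true = np.array([[0.5, 0.0],     # x2 has no effect on x1
                   [0.4, 0.3]])    # x1 Granger-causes x2
X = np.zeros((T, 2))
for t in range(1, T):
    X[t] = A_true @ X[t - 1] + 0.1 * rng.normal(size=2)

# Least-squares VAR(1) fit:  X[1:] ≈ X[:-1] @ A_hat.T
A_hat = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0].T

# Entry (i, j) of gc says "series j Granger-causes series i".
gc = np.abs(A_hat) > 0.1
```

The paper's penalties replace this crude thresholding: group norms drive whole blocks of lag coefficients exactly to zero, which is what yields interpretable networks across subjects.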

* 23 pages, 9 figures, 3 tables. Comments: title changed; math expressions corrected in the formulation and algorithm sections; more references and discussion added to explain difficulties in the convergence analysis
  


Higher-Order Visualization of Causal Structures in Dynamic Graphs

Aug 16, 2019
Vincenzo Perri, Ingo Scholtes

Graph drawing and visualisation techniques are important tools for the exploratory analysis of complex systems. While these methods are regularly applied to visualise data on complex networks, we increasingly have access to time series data that can be modelled as temporal networks or dynamic graphs. In such dynamic graphs, the temporal ordering of time-stamped edges determines the causal topology of a system, i.e. which nodes can directly and indirectly influence each other via a so-called causal path. While this causal topology is crucial for understanding dynamical processes, the role of nodes, and cluster structures, we lack graph drawing techniques that incorporate this information into static visualisations. Addressing this gap, we present a novel dynamic graph drawing algorithm that utilises higher-order graphical models of causal paths in time series data to compute time-aware static graph visualisations. These visualisations combine the simplicity of static graphs with a time-aware layout algorithm that highlights patterns in the causal topology that result from the temporal dynamics of edges.
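The notion of a causal path that the abstract describes has a simple operational definition: node a can influence node c via b only if the edge (a, b) occurs before the edge (b, c). A minimal sketch, with an illustrative edge list and a maximum waiting time `delta` as assumed parameters:

```python
# Sketch: nodes reachable from a source via time-respecting (causal) paths.
from collections import defaultdict

def causal_successors(edges, source, delta=float("inf")):
    """Nodes reachable from `source` along paths whose consecutive edges
    have strictly increasing timestamps, at most `delta` apart."""
    out = defaultdict(list)                    # node -> [(timestamp, neighbour)]
    for u, v, t in edges:
        out[u].append((t, v))
    reached, seen = set(), set()
    frontier = [(source, float("-inf"))]       # (node, arrival time)
    while frontier:
        node, t_arr = frontier.pop()
        for t, nxt in out[node]:
            if t > t_arr and t - t_arr <= delta and (nxt, t) not in seen:
                seen.add((nxt, t))
                reached.add(nxt)
                frontier.append((nxt, t))
    return reached

# (a,b) at t=1 then (b,c) at t=2: a causally reaches c; but (c,d) happened
# at t=1 < 2, so d is NOT reachable from a despite the static link c -> d.
edges = [("a", "b", 1), ("b", "c", 2), ("c", "d", 1)]
reach_a = causal_successors(edges, "a")
```

The gap between this causal reachability and static-graph reachability (which would include d) is exactly the information the paper's higher-order layout algorithm encodes in the drawing.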

  

A Direct Estimation of High Dimensional Stationary Vector Autoregressions

Oct 29, 2014
Fang Han, Huanran Lu, Han Liu

The vector autoregressive (VAR) model is a powerful tool in modeling complex time series and has been exploited in many fields. However, fitting a high-dimensional VAR model poses some unique challenges: On one hand, the dimensionality, caused by modeling a large number of time series and higher-order autoregressive processes, is usually much higher than the time series length; On the other hand, the temporal dependence structure in the VAR model gives rise to extra theoretical challenges. In high dimensions, one popular approach is to assume the transition matrix is sparse and fit the VAR model using the "least squares" method with a lasso-type penalty. In this manuscript, we propose an alternative way of estimating the VAR model. The main idea is, via exploiting the temporal dependence structure, to formulate the estimation problem as a linear program. There is an instant advantage of the proposed approach over the lasso-type estimators: The estimation equation can be decomposed into multiple sub-equations and accordingly can be efficiently solved in a parallel fashion. In addition, our method brings new theoretical insights into the VAR model analysis. So far the theoretical results developed in high dimensions (e.g., Song and Bickel (2011) and Kock and Callot (2012)) mainly pose assumptions on the design matrix of the formulated regression problems. Such conditions are indirect about the transition matrices and not transparent. In contrast, our results show that the operator norm of the transition matrices plays an important role in estimation accuracy. We provide explicit rates of convergence for both estimation and prediction. In addition, we provide thorough experiments on both synthetic and real-world equity data to show that there are empirical advantages of our method over the lasso-type estimators in both parameter estimation and forecasting.
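The decomposition the abstract advertises follows from the stationarity relation Sigma_1 = A Sigma_0: each row of the transition matrix A can be estimated from sample covariances by an independent linear program. The sketch below uses a Dantzig-selector-style formulation solved with `scipy.optimize.linprog`; the toy system and the tuning value `lam` are illustrative assumptions, not the paper's exact estimator or theory-driven choice of tuning parameter.

```python
# Sketch: row-wise linear-program estimation of a sparse VAR(1) transition matrix.
import numpy as np
from scipy.optimize import linprog

def estimate_row(S0, s1_row, lam):
    """min ||a||_1  s.t.  ||S0 @ a - s1_row||_inf <= lam, via a = a+ - a-."""
    p = len(s1_row)
    c = np.ones(2 * p)                       # objective: sum(a+) + sum(a-)
    M = np.hstack([S0, -S0])                 # encodes S0 @ (a+ - a-)
    A_ub = np.vstack([M, -M])
    b_ub = np.concatenate([s1_row + lam, -(s1_row - lam)])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None))
    return res.x[:p] - res.x[p:]

rng = np.random.default_rng(2)
A_true = np.array([[0.5, 0.0], [0.4, 0.3]])
T = 5000
X = np.zeros((T, 2))
for t in range(1, T):
    X[t] = A_true @ X[t - 1] + rng.normal(size=2)

S0 = (X[:-1].T @ X[:-1]) / (T - 1)           # sample lag-0 covariance
S1 = (X[1:].T @ X[:-1]) / (T - 1)            # sample lag-1 covariance
# Each row is an independent subproblem -- trivially parallelisable.
A_hat = np.vstack([estimate_row(S0, S1[i], lam=0.08) for i in range(2)])
```

This is the "instant advantage" the abstract claims: unlike a joint lasso fit, the rows above could be dispatched to separate workers with no shared state.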

* 36 pages, 3 figures 
  

Bayesian Synthesis of Probabilistic Programs for Automatic Data Modeling

Jul 14, 2019
Feras A. Saad, Marco F. Cusumano-Towner, Ulrich Schaechtle, Martin C. Rinard, Vikash K. Mansinghka

We present new techniques for automatically constructing probabilistic programs for data analysis, interpretation, and prediction. These techniques work with probabilistic domain-specific data modeling languages that capture key properties of a broad class of data generating processes, using Bayesian inference to synthesize probabilistic programs in these modeling languages given observed data. We provide a precise formulation of Bayesian synthesis for automatic data modeling that identifies sufficient conditions for the resulting synthesis procedure to be sound. We also derive a general class of synthesis algorithms for domain-specific languages specified by probabilistic context-free grammars and establish the soundness of our approach for these languages. We apply the techniques to automatically synthesize probabilistic programs for time series data and multivariate tabular data. We show how to analyze the structure of the synthesized programs to compute, for key qualitative properties of interest, the probability that the underlying data generating process exhibits each of these properties. We then translate probabilistic programs in the domain-specific language into probabilistic programs in Venture, a general-purpose probabilistic programming system. The translated Venture programs are then executed to obtain predictions of new time series data and new multivariate data records. Experimental results show that our techniques can accurately infer qualitative structure in multiple real-world data sets and outperform standard data analysis methods in forecasting and predicting new data.
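One ingredient of the approach, drawing candidate programs from a domain-specific language specified by a probabilistic context-free grammar, can be sketched directly. The tiny kernel-expression grammar below is an illustrative assumption, not the paper's actual DSL, and Bayesian synthesis would additionally weight these prior draws by the likelihood of the observed data.

```python
# Sketch: ancestral sampling from a probabilistic context-free grammar.
import random

# Nonterminal -> list of (probability, right-hand side) rules.
GRAMMAR = {
    "K": [
        (0.4,  ["linear"]),
        (0.3,  ["periodic"]),
        (0.15, ["(", "K", "+", "K", ")"]),
        (0.15, ["(", "K", "*", "K", ")"]),
    ],
}

def sample(symbol, rng):
    if symbol not in GRAMMAR:              # terminal symbol
        return [symbol]
    r, acc = rng.random(), 0.0
    for p, rhs in GRAMMAR[symbol]:
        acc += p
        if r <= acc:
            return [tok for s in rhs for tok in sample(s, rng)]
    return [symbol]                        # unreachable: probabilities sum to 1

rng = random.Random(3)
programs = [" ".join(sample("K", rng)) for _ in range(5)]
```

Because the expected number of child nonterminals per expansion is 0.6 < 1, sampling terminates with probability one; this subcriticality is the kind of condition a soundness argument for grammar-based synthesis has to check.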

* Proc. ACM Program. Lang. 3, POPL, Article 37 (January 2019) 
  

Changepoint detection for high-dimensional time series with missing data

Dec 07, 2012
Yao Xie, Jiaji Huang, Rebecca Willett

This paper describes a novel approach to change-point detection when the observed high-dimensional data may have missing elements. The performance of classical methods for change-point detection typically scales poorly with the dimensionality of the data, so that a large number of observations are collected after the true change-point before it can be reliably detected. Furthermore, missing components in the observed data handicap conventional approaches. The proposed method addresses these challenges by modeling the dynamic distribution underlying the data as lying close to a time-varying low-dimensional submanifold embedded within the ambient observation space. Specifically, streaming data is used to track a submanifold approximation, measure deviations from this approximation, and calculate a series of statistics of the deviations for detecting when the underlying manifold has changed in a sharp or unexpected manner. The approach described in this paper leverages several recent results in the field of high-dimensional data analysis, including subspace tracking with missing data, multiscale analysis techniques for point clouds, online optimization, and change-point detection performance analysis. Simulations and experiments highlight the robustness and efficacy of the proposed approach in detecting an abrupt change in an otherwise slowly varying low-dimensional manifold.
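The detection principle can be sketched compactly: project each partially observed high-dimensional sample onto a low-dimensional subspace estimate using only its observed coordinates, and monitor the residual norm for a sustained jump. Replacing the paper's online subspace tracking with a fixed PCA estimate from an assumed change-free initial window is a simplification for illustration, as are the toy dimensions and threshold.

```python
# Sketch: residual-based change detection with missing entries.
import numpy as np

rng = np.random.default_rng(4)
d, k, T = 50, 3, 200
U1 = np.linalg.qr(rng.normal(size=(d, k)))[0]   # pre-change subspace
U2 = np.linalg.qr(rng.normal(size=(d, k)))[0]   # post-change subspace
X = np.array([(U1 if t < 120 else U2) @ rng.normal(size=k)
              + 0.01 * rng.normal(size=d) for t in range(T)])

mask = rng.random((T, d)) < 0.7                 # ~70% of entries observed

# Subspace estimated from an initial window assumed to be change-free.
U_hat = np.linalg.svd(X[:50].T, full_matrices=False)[0][:, :k]

def residual(x, m, U):
    """Per-coordinate residual after least-squares fit of the observed
    coordinates of x to the corresponding rows of U."""
    Uo, xo = U[m], x[m]
    w = np.linalg.lstsq(Uo, xo, rcond=None)[0]
    return np.linalg.norm(xo - Uo @ w) / np.sqrt(m.sum())

stats = np.array([residual(X[t], mask[t], U_hat) for t in range(T)])
change_est = int(np.argmax(stats > 5 * stats[:100].mean()))
```

The least-squares fit on observed coordinates only is what lets the statistic tolerate missing data; the paper's multiscale statistics and tracking refine this basic residual.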

  

Fast Distribution Grid Line Outage Identification with $μ$PMU

Nov 14, 2018
Yizheng Liao, Yang Weng, Chin-Woo Tan, Ram Rajagopal

The growing integration of distributed energy resources (DERs) in urban distribution grids raises various reliability issues due to DERs' uncertain and complex behaviors. With a large-scale DER penetration, traditional outage detection methods, which rely on customers making phone calls and smart meters' "last gasp" signals, will have limited performance, because the renewable generators can supply power after line outages and many urban grids are meshed, so line outages do not affect power supply. To address these drawbacks, we propose a data-driven outage monitoring approach based on the stochastic time series analysis from micro phasor measurement units ($\mu$PMUs). Specifically, we prove via power flow analysis that the dependency of time-series voltage measurements exhibits significant statistical changes after line outages. This makes the theory on optimal change-point detection suitable to identify line outages via $\mu$PMUs with fast and accurate sampling. However, existing change-point detection methods require the post-outage voltage distribution, which is unknown in distribution systems. Therefore, we design a maximum likelihood-based method to directly learn the distribution parameters from $\mu$PMU data. We prove that the estimated parameters-based detection still achieves the optimal performance, making it extremely useful for distribution grid outage identification. Simulation results show highly accurate outage identification in eight distribution grids across 14 configurations, with and without DERs, using $\mu$PMU data.
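The change-point machinery the abstract invokes can be sketched with a CUSUM recursion on a voltage stream. The Gaussian mean-shift model, the per-unit toy data, and the threshold are illustrative assumptions; the paper detects changes in the statistical dependency of multi-bus voltage measurements rather than a simple mean shift.

```python
# Sketch: CUSUM change-point detection on a toy voltage time series.
import numpy as np

rng = np.random.default_rng(5)
n, tau = 400, 250
v = np.concatenate([1.00 + 0.005 * rng.normal(size=tau),      # pre-outage (p.u.)
                    0.97 + 0.005 * rng.normal(size=n - tau)])  # post-outage

mu0, mu1, sigma = 1.00, 0.97, 0.005
# Log-likelihood ratio of each sample: post-change vs pre-change model.
llr = (v - mu0) ** 2 / (2 * sigma**2) - (v - mu1) ** 2 / (2 * sigma**2)

# CUSUM recursion: W_t = max(0, W_{t-1} + llr_t); alarm when W_t crosses h.
W = np.zeros(n)
for t in range(1, n):
    W[t] = max(0.0, W[t - 1] + llr[t])
alarm = int(np.argmax(W > 20.0))
```

The recursion resets to zero while pre-change samples dominate and climbs steeply once the outage shifts the distribution, which is why the detection delay here is only a sample or two.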

* 9 pages 
  

Revisiting Inaccuracies of Time Series Averaging under Dynamic Time Warping

Sep 07, 2018
Brijnesh Jain

This article revisits an analysis on inaccuracies of time series averaging under dynamic time warping conducted by Niennattrakul et al. (2007). The authors presented a correctness-criterion and introduced drift-outs of averages from clusters. They claimed that averages are inaccurate if they are incorrect or drift-outs. Furthermore, they conjectured that such inaccuracies are caused by the lack of triangle inequality. We show that a rectified version of the correctness-criterion is unsatisfiable and that the concept of drift-out is geometrically and operationally inconclusive. Satisfying the triangle inequality is insufficient to achieve correctness and unnecessary to overcome the drift-out phenomenon. We place the concept of drift-out on a principled basis and show that sample means as global minimizers of a Fréchet function never drift out. The adjusted drift-out is a way to test to what extent an approximation is coherent. Empirical results show that solutions obtained by the state-of-the-art methods SSG and DBA are incoherent approximations of a sample mean in over a third of all trials.
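The central object of the argument, the Fréchet function F(z) = (1/n) Σ_i dtw(z, x_i)², whose global minimizers are the sample means under dynamic time warping, is easy to compute for small inputs. The tiny DTW implementation and the toy series below are illustrative; SSG and DBA are heuristics for minimizing exactly this F.

```python
# Sketch: DTW distance and the Frechet function whose minimizers are DTW means.
import numpy as np

def dtw(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = abs(a[i - 1] - b[j - 1]) + min(
                D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def frechet(z, sample):
    return np.mean([dtw(z, x) ** 2 for x in sample])

sample = [np.array([0., 1., 0.]), np.array([0., 0., 1., 0.])]
candidates = [np.array([0., 1., 0.]), np.array([0.5, 0.5, 0.5])]
values = [frechet(z, sample) for z in candidates]
best = candidates[int(np.argmin(values))]
```

Here the first candidate has Fréchet value 0 (DTW warps it exactly onto both sample series), so it is a global minimizer; the paper's point is that such a minimizer can never drift out of its cluster, whereas heuristic approximations can be incoherent.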

  

Quick Line Outage Identification in Urban Distribution Grids via Smart Meters

Apr 01, 2021
Yizheng Liao, Yang Weng, Chin-woo Tan, Ram Rajagopal

The growing integration of distributed energy resources (DERs) in distribution grids raises various reliability issues due to DERs' uncertain and complex behaviors. With a large-scale DER penetration in distribution grids, traditional outage detection methods, which rely on customer reports and smart meters' "last gasp" signals, will have poor performance, because renewable generators, storage, and the mesh structure in urban distribution grids can continue supplying power after line outages. To address these challenges, we propose a data-driven outage monitoring approach based on stochastic time series analysis with a theoretical guarantee. Specifically, we prove via power flow analysis that the dependency of time-series voltage measurements exhibits significant statistical changes after line outages. This makes the theory on optimal change-point detection suitable to identify line outages. However, existing change-point detection methods require the post-outage voltage distribution, which is unknown in distribution systems. Therefore, we design a maximum likelihood estimator to directly learn the distribution parameters from voltage data. We prove that the estimated parameters-based detection also achieves the optimal performance, making it extremely useful for fast distribution grid outage identification. Furthermore, since smart meters have been widely installed in distribution grids and advanced infrastructure (e.g., PMUs) is not yet widely available, our approach only requires voltage magnitude for quick outage identification. Simulation results show highly accurate outage identification in eight distribution grids across 14 configurations, with and without DERs, using smart meter data.
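The step that distinguishes this approach, handling an unknown post-outage distribution by estimating its parameters via maximum likelihood, can be sketched as a generalized likelihood-ratio (GLR) statistic: for each candidate change time, the unknown post-change mean is replaced by its MLE, the sample average after that time. The Gaussian mean-change model and the toy voltage data are illustrative simplifications of the paper's voltage-dependency model.

```python
# Sketch: GLR change-point localisation with an MLE of the unknown post-change mean.
import numpy as np

rng = np.random.default_rng(6)
n, tau = 300, 200
v = np.concatenate([1.00 + 0.004 * rng.normal(size=tau),
                    0.98 + 0.004 * rng.normal(size=n - tau)])

mu0, sigma = 1.00, 0.004

def glr_stat(v, k):
    """GLR for a change at time k, with the post-change mean replaced by
    its maximum-likelihood estimate (the sample average of v[k:])."""
    seg = v[k:]
    mu1_hat = seg.mean()
    return ((seg - mu0) ** 2 - (seg - mu1_hat) ** 2).sum() / (2 * sigma**2)

stats = np.array([glr_stat(v, k) for k in range(1, n)])
change_est = 1 + int(np.argmax(stats))
```

Maximizing the GLR over candidate times both detects and localises the outage without knowing the post-outage distribution in advance, which is the property the paper proves retains optimal performance.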

* 12 pages, 12 figures. arXiv admin note: substantial text overlap with arXiv:1811.05646 
  