Models, code, and papers for "Time Series Analysis":

Highly comparative time-series analysis: The empirical structure of time series and their methods

Apr 03, 2013
Ben D. Fulcher, Max A. Little, Nick S. Jones

The process of collecting and organizing sets of observations represents a common theme throughout the history of science. However, despite the ubiquity of scientists measuring, recording, and analyzing the dynamics of different processes, an extensive organization of scientific time-series data and analysis methods has never been performed. Addressing this, annotated collections of over 35 000 real-world and model-generated time series and over 9000 time-series analysis algorithms are analyzed in this work. We introduce reduced representations of both time series, in terms of their properties measured by diverse scientific methods, and of time-series analysis methods, in terms of their behaviour on empirical time series, and use them to organize these interdisciplinary resources. This new approach to comparing across diverse scientific data and methods allows us to organize time-series datasets automatically according to their properties, retrieve alternatives to particular analysis methods developed in other scientific disciplines, and automate the selection of useful methods for time-series classification and regression tasks. The broad scientific utility of these tools is demonstrated on datasets of electroencephalograms, self-affine time series, heart beat intervals, speech signals, and others, in each case contributing novel analysis techniques to the existing literature. Highly comparative techniques that compare across an interdisciplinary literature can thus be used to guide more focused research in time-series analysis for applications across the scientific disciplines.

* J. R. Soc. Interface vol. 10 no. 83 20130048 (2013) 

Feature-based time-series analysis

Oct 02, 2017
Ben D. Fulcher

This work presents an introduction to feature-based time-series analysis. The time series as a data type is first described, along with an overview of the interdisciplinary time-series analysis literature. I then summarize the range of feature-based representations for time series that have been developed to aid interpretable insights into time-series structure. Particular emphasis is given to emerging research that facilitates wide comparison of feature-based representations that allow us to understand the properties of a time-series dataset that make it suited to a particular feature-based representation or analysis algorithm. The future of time-series analysis is likely to embrace approaches that exploit machine learning methods to partially automate human learning to aid understanding of the complex dynamical patterns in the time series we measure from the world.

* 28 pages, 9 figures 

Multilevel Wavelet Decomposition Network for Interpretable Time Series Analysis

Jun 23, 2018
Jingyuan Wang, Ze Wang, Jianfeng Li, Junjie Wu

Recent years have witnessed an unprecedented rise of time series from almost all kinds of academic and industrial fields. Various types of deep neural network models have been introduced to time series analysis, but the important frequency information still lacks effective modeling. In light of this, in this paper we propose a wavelet-based neural network structure called multilevel Wavelet Decomposition Network (mWDN) for building frequency-aware deep learning models for time series analysis. mWDN preserves the advantage of multilevel discrete wavelet decomposition in frequency learning while enabling the fine-tuning of all parameters under a deep neural network framework. Based on mWDN, we further propose two deep learning models called Residual Classification Flow (RCF) and multi-frequency Long Short-Term Memory (mLSTM) for time series classification and forecasting, respectively. The two models take all or part of the mWDN-decomposed sub-series in different frequencies as input, and resort to the back propagation algorithm to learn all the parameters globally, which enables seamless embedding of wavelet-based frequency analysis into deep learning frameworks. Extensive experiments on 40 UCR datasets and a real-world user volume dataset demonstrate the excellent performance of our time series models based on mWDN. In particular, we propose an importance analysis method for mWDN-based models, which successfully identifies those time-series elements and mWDN layers that are crucially important to time series analysis. This indicates the interpretability advantage of mWDN, and can be viewed as an in-depth exploration of interpretable deep learning.
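
To make the idea of multilevel discrete wavelet decomposition concrete, here is a minimal sketch in plain Python. It assumes the Haar wavelet and an even-length input at every level; the abstract does not say which wavelet mWDN starts from, and the function names `haar_step` and `multilevel_haar` are hypothetical. It shows only the fixed decomposition, not the trainable network.

```python
import math

def haar_step(signal):
    """One Haar level: split a signal into low-frequency and high-frequency halves."""
    s = 1.0 / math.sqrt(2.0)
    # Pairwise scaled sums give the approximation, scaled differences the detail.
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def multilevel_haar(signal, levels):
    """Repeatedly decompose the approximation; return it with the per-level details."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break  # nothing left to split
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details

# Illustrative input only: a piecewise-constant series of length 8.
signal = [4, 4, 2, 2, 6, 6, 0, 0]
approx, details = multilevel_haar(signal, levels=2)
```

Each entry of `details` is a sub-series at a different frequency band, which is the kind of input the abstract describes feeding to the downstream models.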


Granger Mediation Analysis of Multiple Time Series with an Application to fMRI

Sep 15, 2017
Yi Zhao, Xi Luo

It has become increasingly popular to perform mediation analysis for complex data from sophisticated experimental studies. In this paper, we present Granger Mediation Analysis (GMA), a new framework for causal mediation analysis of multiple time series. This framework is motivated by a functional magnetic resonance imaging (fMRI) experiment where we are interested in estimating the mediation effects between a randomized stimulus time series and brain activity time series from two brain regions. The stable unit treatment assumption for causal mediation analysis is thus unrealistic for this type of time series data. To address this challenge, our framework integrates two types of models: causal mediation analysis across the variables and vector autoregressive models across the temporal observations. We further extend this framework to handle multilevel data to address individual variability and correlated errors between the mediator and the outcome variables. These models not only provide valid causal mediation for time series data but also model the causal dynamics across time. We show that the modeling parameters in our models are identifiable, and we develop computationally efficient methods to maximize the likelihood-based optimization criteria. Simulation studies show that our method reduces the estimation bias and improves statistical power, compared to existing approaches. On a real fMRI data set, our approach not only infers the causal effects of brain pathways but also accurately captures the feedback effect of the outcome region on the mediator region.

* 59 pages. Presented at the 2017 ENAR, JSM, and other meetings 

Time Series Analysis via Matrix Estimation

Aug 24, 2018
Anish Agarwal, Muhammad Jehangir Amjad, Devavrat Shah, Dennis Shen

We propose an algorithm to impute and forecast a time series by transforming the observed time series into a matrix, utilizing matrix estimation to recover missing values and de-noise observed entries, and performing linear regression to make predictions. At the core of our analysis is a representation result, which states that for a large model class, the transformed matrix obtained from the time series via our algorithm is (approximately) low-rank. This, in effect, generalizes the widely used Singular Spectrum Analysis (SSA) in the literature, and allows us to establish a rigorous link between time series analysis and matrix estimation. The key is to construct a matrix with non-overlapping entries rather than with the Hankel matrix as done in the literature, including in SSA. We provide finite sample analysis for imputation and prediction leading to the asymptotic consistency of our method. A salient feature of our algorithm is that it is model agnostic both with respect to the underlying time dynamics as well as the noise model in the observations. Being noise agnostic makes our algorithm applicable to the setting where the state is hidden and we only have access to its noisy observations a la a Hidden Markov Model, e.g., observing a Poisson process with a time-varying parameter without knowing that the process is Poisson, but still recovering the time-varying parameter accurately. As part of the forecasting algorithm, an important task is to perform regression with noisy observations of the features a la an error-in-variable regression. In essence, our approach suggests a matrix estimation based method for such a setting, which could be of interest in its own right. Through synthetic and real-world datasets, we demonstrate that our algorithm outperforms standard software packages (including R libraries) in the presence of missing data as well as high levels of noise.
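
The difference between a matrix with non-overlapping entries and a Hankel matrix can be sketched in a few lines of plain Python. This is an illustration only: the column-wise stacking, the dropping of leftover trailing points, and the function names `non_overlapping_matrix` and `hankel_matrix` are assumptions rather than details given in the abstract, and the matrix estimation and regression steps are omitted.

```python
def non_overlapping_matrix(series, rows):
    """Stack consecutive length-`rows` segments as columns; each value appears once."""
    cols = len(series) // rows  # trailing values that do not fill a column are dropped
    return [[series[c * rows + r] for c in range(cols)] for r in range(rows)]

def hankel_matrix(series, rows):
    """Stack sliding windows as columns; neighbouring columns share entries."""
    cols = len(series) - rows + 1
    return [[series[r + c] for c in range(cols)] for r in range(rows)]

# Illustrative series 1..6 arranged with two rows.
series = [1, 2, 3, 4, 5, 6]
page = non_overlapping_matrix(series, rows=2)
hankel = hankel_matrix(series, rows=2)
```

In the non-overlapping layout every observation occupies exactly one cell, whereas the Hankel layout repeats each interior observation along an anti-diagonal.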


Time Series Analysis of Electricity Price and Demand to Find Cyber-attacks using Stationary Analysis

Aug 20, 2019
Mohsen Rakhshandehroo, Mohammad Rajabdorri

With the development of computational tools in recent years, data analysis methods for finding insightful information are becoming more common among industries and researchers. This paper is the first part of a time series analysis of New England electricity price and demand to find anomalies in the data. In this paper, time-series stationarity criteria for preparing data for further time-series analysis are investigated. Three main analyses are conducted in this paper: moving average, moving standard deviation, and the augmented Dickey-Fuller test. The data used in this paper is New England big data from 9 different operational zones. For each zone, 4 different variables are considered: day-ahead (DA) electricity demand and price, and real-time (RT) electricity demand and price.

* 9 pages, 13 figures, 4 tables
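
As a rough illustration of the first two checks named in the abstract, the sketch below computes a moving average and a moving standard deviation in plain Python. The window length, the synthetic `demand` values, and the function name `rolling_stats` are assumptions made for the example. The augmented Dickey-Fuller test is not reproduced here; an implementation is available as `adfuller` in the statsmodels package.

```python
from statistics import mean, pstdev

def rolling_stats(series, window):
    """Return moving means and moving (population) standard deviations over full windows."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    means, stds = [], []
    for end in range(window, len(series) + 1):
        chunk = series[end - window:end]
        means.append(mean(chunk))
        stds.append(pstdev(chunk))
    return means, stds

# Synthetic demand with a steady upward trend (not real market data).
demand = [100 + 2 * t for t in range(12)]
means, stds = rolling_stats(demand, window=4)
```

A moving mean that drifts while the moving standard deviation stays flat, as in this synthetic case, suggests a series that is non-stationary in its mean.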

Deep Learning for Time-Series Analysis

Jan 07, 2017
John Cristian Borges Gamboa

In many real-world applications, e.g., speech recognition or sleep stage classification, data are captured over the course of time, constituting a Time-Series. Time-Series often contain temporal dependencies that cause two otherwise identical points of time to belong to different classes or predict different behavior. This characteristic generally increases the difficulty of analysing them. Existing techniques often depended on hand-crafted features that were expensive to create and required expert knowledge of the field. With the advent of Deep Learning, new models of unsupervised learning of features for Time-Series analysis and forecasting have been developed. Such new developments are the topic of this paper: a review of the main Deep Learning techniques is presented, and some applications on Time-Series analysis are summarized. The results make it clear that Deep Learning has a lot to contribute to the field.

* Written as part of the Seminar on Collaborative Intelligence in the TU Kaiserslautern. January 2016 

A Recurrent Probabilistic Neural Network with Dimensionality Reduction Based on Time-series Discriminant Component Analysis

Nov 14, 2019
Hideaki Hayashi, Taro Shibanoki, Keisuke Shima, Yuichi Kurita, Toshio Tsuji

This paper proposes a probabilistic neural network developed on the basis of time-series discriminant component analysis (TSDCA) that can be used to classify high-dimensional time-series patterns. TSDCA involves the compression of high-dimensional time series into a lower-dimensional space using a set of orthogonal transformations and the calculation of posterior probabilities based on a continuous-density hidden Markov model with a Gaussian mixture model expressed in the reduced-dimensional space. The analysis can be incorporated into a neural network, which is named a time-series discriminant component network (TSDCN), so that parameters of dimensionality reduction and classification can be obtained simultaneously as network coefficients according to a backpropagation through time-based learning algorithm with the Lagrange multiplier method. The TSDCN is considered to enable high-accuracy classification of high-dimensional time-series patterns and to reduce the computation time taken for network training. The validity of the TSDCN is demonstrated for high-dimensional artificial data and EEG signals in the experiments conducted during the study.

* IEEE Transactions on Neural Networks and Learning Systems, Vol. 26, No.12, pp. 3021-3033, 2015 
* Published in IEEE Transactions on Neural Networks and Learning Systems 

A Consistent Independence Test for Multivariate Time-Series

Aug 18, 2019
Ronak Mehta, Cencheng Shen, Ting Xu, Joshua T. Vogelstein

A fundamental problem in statistical data analysis is testing whether two phenomena are related. When the phenomena in question are time series, many challenges emerge. The first is defining a dependence measure between time series at the population level, as well as a sample level test statistic. The second is computing or estimating the distribution of this test statistic under the null, as the permutation test procedure is invalid for most time series structures. This work aims to address these challenges by combining distance correlation and multiscale graph correlation (MGC) from independence testing literature and block permutation testing from time series analysis. Two hypothesis tests for testing the independence of time series are proposed. These procedures also characterize whether the dependence relationship between the series is linear or nonlinear, and the time lag at which this dependence is maximized. For strictly stationary auto-regressive moving average (ARMA) processes, the proposed independence tests are proven valid and consistent. Finally, neural connectivity in the brain is analyzed using fMRI data, revealing linear dependence of signals within the visual network and default mode network, and nonlinear relationships in other regions. This work opens up new theoretical and practical directions for many modern time series analysis problems.

* 21 pages, 6 figures 

The Influence of Global Constraints on Similarity Measures for Time-Series Databases

Dec 25, 2013
Vladimir Kurbalija, Miloš Radovanović, Zoltan Geler, Mirjana Ivanović

A time series consists of a series of values or events obtained over repeated measurements in time. Analysis of time series represents an important tool in many application areas, such as stock market analysis, process and quality control, observation of natural phenomena, medical treatments, etc. A vital component in many types of time-series analysis is the choice of an appropriate distance/similarity measure. Numerous measures have been proposed to date, with the most successful ones based on dynamic programming. Because these measures have quadratic time complexity, however, global constraints are often employed to limit the search space in the matrix during the dynamic programming procedure, in order to speed up computation. Furthermore, it has been reported that such constrained measures can also achieve better accuracy. In this paper, we investigate two representative time-series distance/similarity measures based on dynamic programming, Dynamic Time Warping (DTW) and Longest Common Subsequence (LCS), and the effects of global constraints on them. Through extensive experiments on a large number of time-series data sets, we demonstrate how global constraints can significantly reduce the computation time of DTW and LCS. We also show that, if the constraint parameter is tight enough (less than 10-15% of time-series length), the constrained measure becomes significantly different from its unconstrained counterpart, in the sense of producing qualitatively different 1-nearest neighbor graphs. This observation explains the potential for accuracy gains when using constrained measures, highlighting the need for careful tuning of constraint parameters in order to achieve a good trade-off between speed and accuracy.
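
The effect of a global constraint on DTW can be seen in a small plain-Python sketch. It assumes a band constraint of the Sakoe-Chiba type (cell (i, j) is allowed only if |i - j| is at most `window`) and an absolute-difference local cost; the abstract does not specify either choice, and `dtw_distance` is a hypothetical name.

```python
import math

def dtw_distance(a, b, window=None):
    """DTW cost between sequences a and b; `window` bounds |i - j| (None = unconstrained)."""
    n, m = len(a), len(b)
    if window is None:
        window = max(n, m)
    # The band must span the length difference, or no path reaches the final cell.
    window = max(window, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells inside the band are filled, which is where the speed-up comes from.
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# The same bump, shifted by two time steps (illustrative values).
a = [0, 0, 1, 2, 1, 0, 0, 0]
b = [0, 0, 0, 0, 1, 2, 1, 0]
unconstrained = dtw_distance(a, b)
tight = dtw_distance(a, b, window=1)
rigid = dtw_distance(a, b, window=0)
```

With no constraint the shifted bumps align at zero cost, while a band narrower than the shift forces a nonzero distance. This is the sense in which a tight constraint yields a measure that differs from its unconstrained counterpart.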


Analysis of Nonstationary Time Series Using Locally Coupled Gaussian Processes

Oct 31, 2016
Luca Ambrogioni, Eric Maris

The analysis of nonstationary time series is of great importance in many scientific fields such as physics and neuroscience. In recent years, Gaussian process regression has attracted substantial attention as a robust and powerful method for analyzing time series. In this paper, we introduce a new framework for analyzing nonstationary time series using locally stationary Gaussian process analysis with parameters that are coupled through a hidden Markov model. The main advantage of this framework is that arbitrarily complex nonstationary covariance functions can be obtained by combining simpler stationary building blocks whose hidden parameters can be estimated in closed form. We demonstrate the flexibility of the method by analyzing two examples of synthetic nonstationary signals: oscillations with time-varying frequency and time series with two dynamical states. Finally, we report an example application on real magnetoencephalographic measurements of brain activity.


Clustering Activity-Travel Behavior Time Series using Topological Data Analysis

Jul 17, 2019
Renjie Chen, Jingyue Zhang, Nalini Ravishanker, Karthik Konduri

Over the last few years, traffic data has been exploding and the transportation discipline has entered the era of big data. This brings new opportunities for data-driven analysis, but it also challenges traditional analytic methods. This paper proposes a new Divide and Combine based approach to K-means clustering of activity-travel behavior time series, using features derived with tools from Time Series Analysis and Topological Data Analysis. Clustering data from five waves of the National Household Travel Survey ranging from 1990 to 2017 suggests that activity-travel patterns of individuals over the last three decades can be grouped into three clusters. Results also provide evidence in support of recent claims about differences in activity-travel patterns of different survey cohorts. The proposed method is generally applicable and is not limited to activity-travel behavior analysis in transportation studies. Driving behavior, travel mode choice, and household vehicle ownership, when characterized as categorical time series, can all be analyzed using the proposed method.


Forecasting with time series imaging

Apr 17, 2019
Xixi Li, Yanfei Kang, Feng Li

Feature-based time series representation has attracted substantial attention in a wide range of time series analysis methods. Recently, the use of time series features for forecast model selection and model averaging has been an emerging research focus in the forecasting community. Nonetheless, most of the existing approaches depend on the manual choice of an appropriate set of features. Exploiting machine learning methods to automatically extract features from time series has become crucially important in state-of-the-art time series analysis. In this paper, we introduce an automated approach to extract time series features based on images. Time series are first transformed into recurrence images, from which local features can be extracted using computer vision algorithms. The extracted features are used for forecast model selection and model averaging. Our experiments show that forecasting based on automatically extracted features, with less human intervention and a more comprehensive view of the raw time series data, yields performance comparable to the best methods proposed in the largest forecasting competition, M4.
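
The first step, turning a series into a recurrence image, can be sketched as follows. This is a simplified assumption: a thresholded recurrence matrix on raw scalar values, with a hypothetical tolerance `eps` and function name `recurrence_matrix`. The abstract does not state how the images are constructed in detail, and the feature extraction step is not shown.

```python
def recurrence_matrix(series, eps):
    """Binary image: pixel (i, j) is 1 when the values at times i and j are within eps."""
    n = len(series)
    return [
        [1 if abs(series[i] - series[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]

# A period-2 series produces a checkerboard image (illustrative values).
series = [0, 1, 0, 1]
image = recurrence_matrix(series, eps=0.5)
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
```

Periodic structure in the series shows up as regular texture in the image, which is what makes image descriptors usable as time series features.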


Temporal Feature Selection on Networked Time Series

Dec 22, 2016
Haishuai Wang, Jia Wu, Peng Zhang, Chengqi Zhang

This paper formulates the problem of learning discriminative features (i.e., segments) from networked time series data considering the linked information among time series. For example, social network users are considered to be social sensors that continuously generate social signals (tweets) represented as a time series. The discriminative segments are often referred to as shapelets in a time series. Extracting shapelets for time series classification has been widely studied. However, existing works on shapelet selection assume that the time series are independent and identically distributed (i.i.d.). This assumption restricts their applications to social networked time series analysis, since a user's actions can be correlated to his/her social affiliations. In this paper we propose a new Network Regularized Least Squares (NetRLS) feature selection model that combines typical time series data and user network data for analysis. Experiments on real-world networked time series Twitter and DBLP data demonstrate the performance of the proposed method. NetRLS performs better than LTS, the state-of-the-art time series feature selection approach, on real-world data.

* submitted to a blind review journal 

Time Series Clustering via Community Detection in Networks

Aug 19, 2015
Leonardo N. Ferreira, Liang Zhao

In this paper, we propose a technique for time series clustering using community detection in complex networks. First, we present a method to transform a set of time series into a network using different distance functions, where each time series is represented by a vertex and the most similar ones are connected. Then, we apply community detection algorithms to identify groups of strongly connected vertices (called communities) and, consequently, identify time series clusters. We also present a comprehensive analysis of the influence of various combinations of time series distance functions, network generation methods and community detection techniques on clustering results. The experimental study shows that the proposed network-based approach achieves better results than various classic or up-to-date clustering techniques under consideration. Statistical tests confirm that the proposed method outperforms some classic clustering algorithms, such as k-medoids, diana, median-linkage and centroid-linkage, on various data sets. Interestingly, the proposed method can effectively detect shape patterns present in time series due to the topological structure of the underlying network constructed in the clustering process, whereas other techniques fail to identify such patterns. Moreover, the proposed method is robust enough to group time series presenting similar patterns but with time shifts and/or amplitude variations. In summary, the main point of the proposed method is the transformation of time series from the time-space domain to the topological domain. Therefore, we hope that our approach contributes not only to time series clustering, but also to general time series analysis tasks.


Entropic Dynamic Time Warping Kernels for Co-evolving Financial Time Series Analysis

Oct 21, 2019
Lu Bai, Lixin Cui, Lixiang Xu, Yue Wang, Zhihong Zhang, Edwin R. Hancock

In this work, we develop a novel framework to measure the similarity between dynamic financial networks, i.e., time-varying financial networks. In particular, we explore whether the proposed similarity measure can be employed to understand the structural evolution of the financial networks with time. For a set of time-varying financial networks with each vertex representing the individual time series of a different stock and each edge between a pair of time series representing the absolute value of their Pearson correlation, our starting point is to compute the commute time matrix associated with the weighted adjacency matrix of the network structures, where each element of the matrix can be seen as the enhanced correlation value between pairwise stocks. For each network, we show how the commute time matrix allows us to identify a reliable set of dominant correlated time series as well as an associated dominant probability distribution of the stock belonging to this set. Furthermore, we represent each original network as a discrete dominant Shannon entropy time series computed from the dominant probability distribution. With the dominant entropy time series for each pair of financial networks to hand, we develop a similarity measure based on the classical dynamic time warping framework, for analyzing the financial time-varying networks. We show that the proposed similarity measure is positive definite and thus corresponds to a kernel measure on graphs. The proposed kernel bridges the gap between graph kernels and the classical dynamic time warping framework for multiple financial time series analysis. Experiments on time-varying networks extracted from the New York Stock Exchange (NYSE) database demonstrate the effectiveness of the proposed approach.

* The original version of this manuscript previously appeared as arXiv:1902.09947v2, where it was submitted as a replacement by mistake. That article has since been replaced to correct the error, and this manuscript is distinct from it

"The Squawk Bot": Joint Learning of Time Series and Text Data Modalities for Automated Financial Information Filtering

Dec 20, 2019
Xuan-Hong Dang, Syed Yousaf Shah, Petros Zerfos

Multimodal analysis that uses numerical time series and textual corpora as input data sources is becoming a promising approach, especially in the financial industry. However, the main focus of such analysis has been on achieving high prediction accuracy, while little effort has been spent on the important task of understanding the association between the two data modalities. Performance on the time series hence receives little explanation even though human-understandable textual information is available. In this work, we address the following problem: given a numerical time series and a general corpus of textual stories collected over the same period, discover in a timely manner a succinct set of textual stories associated with that time series. Towards this goal, we propose a novel multi-modal neural model called MSIN that jointly learns both numerical time series and categorical text articles in order to unearth the association between them. Through multiple steps of data interrelation between the two data modalities, MSIN learns to focus on a small subset of text articles that best align with the performance in the time series. This succinct set is discovered in a timely manner and presented as recommended documents, acting as automated information filtering for the given time series. We empirically evaluate the performance of our model on discovering relevant news articles for two stock time series, from Apple and Google, along with the daily news articles collected from Thomson Reuters over a period of seven consecutive years. The experimental results demonstrate that MSIN achieves up to 84.9% and 87.2% recall of the ground truth articles for the two examined time series, respectively, far superior to state-of-the-art algorithms that rely on conventional attention mechanisms in deep learning.


Constructing Time Series Shape Association Measures: Minkowski Distance and Data Standardization

May 20, 2014
Ildar Batyrshin

It is surprising that over the last two decades many works in time series data mining and clustering have been concerned with measures of similarity of time series but not with measures of association that can be used for measuring possible direct and inverse relationships between time series. Inverse relationships can exist between dynamics of prices and sales volumes, between growth patterns of competing companies, between well production data in oilfields, between wind velocity and air pollution concentration, etc. The paper develops a theoretical basis for the analysis and construction of time series shape association measures. Starting from the axioms of time series shape association measures, it studies methods of constructing measures that satisfy these axioms. Several general methods of constructing such measures, suitable for measuring time series shape similarity and shape association, are proposed. Time series shape association measures based on Minkowski distance and data standardization methods are considered. The cosine similarity and Pearson's correlation coefficient are obtained as particular cases of the proposed general methods, which can also be used to construct new association measures in data analysis.

* Published in Proceedings of BRICS CCI 2013, Porto de Galinhas, Brasil, 8-11 September 2013 
* Presented at BRICS CCI 2013, Porto de Galinhas, Brasil, 8-11 September 2013. Reference on Proceedings of BRICS CCI 2013 is added 
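
One well-known relationship between the two measures named in the abstract can be checked directly: Pearson's correlation coefficient equals the cosine similarity of the mean-centered series. The sketch below demonstrates that standard identity in plain Python; it is not the paper's general construction, and the price and volume values are invented for illustration.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def center(x):
    """Standardize by subtracting the mean."""
    m = sum(x) / len(x)
    return [a - m for a in x]

def pearson(x, y):
    """Pearson correlation as cosine similarity after mean-centering."""
    return cosine_similarity(center(x), center(y))

# Hypothetical prices rising while sales volumes fall.
prices = [1, 2, 3, 4]
volumes = [8, 6, 4, 2]
raw_cosine = cosine_similarity(prices, volumes)
association = pearson(prices, volumes)
```

Here the raw cosine similarity is positive because all values are positive, while the centered version is negative and so exposes the inverse relationship. This is the kind of distinction between similarity and association that the abstract draws.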

NonSTOP: A NonSTationary Online Prediction Method for Time Series

Aug 26, 2018
Christopher Xie, Avleen Bijral, Juan Lavista Ferres

We present online prediction methods for time series that let us explicitly handle nonstationary artifacts (e.g. trend and seasonality) present in most real time series. Specifically, we show that applying appropriate transformations to such time series before prediction can lead to improved theoretical and empirical prediction performance. Moreover, since these transformations are usually unknown, we employ the learning with experts setting to develop a fully online method (NonSTOP-NonSTationary Online Prediction) for predicting nonstationary time series. This framework allows for seasonality and/or other trends in univariate time series and cointegration in multivariate time series. Our algorithms and regret analysis subsume recent related work while significantly expanding the applicability of such methods. For all the methods, we provide sub-linear regret bounds using relaxed assumptions. The theoretical guarantees do not fully capture the benefits of the transformations, thus we provide a data-dependent analysis of the follow-the-leader algorithm that provides insight into the success of using such transformations. We support all of our results with experiments on simulated and real data.

