"Time Series Analysis": models, code, and papers

Analysis and modeling to forecast in time series: a systematic review

Mar 31, 2021
Fatoumata Dama, Christine Sinoquet

This paper surveys state-of-the-art methods and models dedicated to time series analysis and modeling, with the final aim of prediction. This review aims to offer a structured and comprehensive view of the full process flow, encompassing time series decomposition, stationarity tests, modeling and forecasting. In addition, for didactic purposes, a unified presentation has been adopted throughout this survey to present decomposition frameworks on the one hand, and linear and nonlinear time series models on the other. First, we clarify the relationships between stationarity and linearity, and further examine the main classes of methods used to test for weak stationarity. Next, the main frameworks for time series decomposition are presented in a unified way: depending on the time series, a more or less complex decomposition scheme seeks to isolate the nonstationary effects (the deterministic components) from a remaining stochastic component. Appropriate modeling of the latter is a critical step for prediction accuracy. We then present three popular linear models, together with two more flexible variants. A step further in model complexity, and still in a unified way, we present five major nonlinear models used for time series. Among nonlinear models, artificial neural networks hold a place apart, as deep learning has recently gained considerable attention. A whole section is therefore dedicated to time series forecasting with deep learning approaches. A final section provides a list of R and Python implementations for the methods, models and tests presented throughout this review. Our intention is to provide sufficient in-depth knowledge while covering a broad range of models and forecasting methods: this compilation spans from well-established conventional approaches to more recent adaptations of deep learning to time series forecasting.

* 65 pages (including 9 pages with bibliographic references), 14 figures, 6 tables 
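To make the surveyed workflow concrete, here is a minimal sketch of the stationarity-test, decomposition, and linear-model pipeline using statsmodels. The synthetic series and all parameter choices are illustrative assumptions, not taken from the paper.

```python
# Sketch of the review's process flow: stationarity test -> decomposition -> linear model.
# The synthetic series and every parameter choice here are illustrative assumptions.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.seasonal import STL
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
t = pd.date_range("2015-01", periods=240, freq="M")
y = pd.Series(0.05 * np.arange(240)                            # trend (deterministic)
              + 2.0 * np.sin(2 * np.pi * np.arange(240) / 12)  # seasonality
              + rng.normal(0, 1, 240), index=t)                # stochastic remainder

# 1. Augmented Dickey-Fuller test for weak stationarity (null hypothesis: unit root).
stat, pvalue, *_ = adfuller(y)
print(f"ADF p-value: {pvalue:.3f}")  # large p-value -> cannot reject nonstationarity

# 2. Decompose into trend + seasonal (deterministic) parts and a stochastic remainder.
res = STL(y, period=12).fit()
remainder = res.resid

# 3. Model the stochastic component with a linear (ARMA) model and forecast.
model = ARIMA(remainder, order=(2, 0, 1)).fit()
print(model.forecast(steps=12))
```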

Modelling stellar activity with Gaussian process regression networks

May 13, 2022
J. D. Camacho, J. P. Faria, P. T. P. Viana

Stellar photospheric activity is known to limit the detection and characterisation of extra-solar planets. In particular, the study of Earth-like planets around Sun-like stars requires data analysis methods that can accurately model the stellar activity phenomena affecting radial velocity (RV) measurements. Gaussian Process Regression Networks (GPRNs) offer a principled approach to the analysis of simultaneous time series, combining the structural properties of Bayesian neural networks with the non-parametric flexibility of Gaussian Processes. Using HARPS-N solar spectroscopic observations spanning three years, we demonstrate that this framework is capable of jointly modelling RV data and traditional stellar activity indicators. Although we consider only the simplest GPRN configuration, we are able to describe the behaviour of solar RV data at least as accurately as previously published methods. We confirm that the correlation between the RV and stellar activity time series reaches a maximum at separations of a few days, and find evidence of non-stationary behaviour in the time series, associated with an approaching solar activity minimum.

* 28 pages, 23 figures, submitted to MNRAS 
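As a point of reference for the modelling approach, below is a minimal sketch of the simpler single-GP baseline that GPRNs generalise, fit to synthetic RV data with scikit-learn. The quasi-periodic kernel form and the roughly 27-day rotation period are standard assumptions for solar-type activity, not the paper's GPRN.

```python
# Minimal sketch: a single quasi-periodic GP fit to synthetic RV data.
# This is the baseline a GPRN generalises, not the network itself; the
# rotation period and kernel hyperparameters below are assumptions.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ExpSineSquared, WhiteKernel

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 100, 60))[:, None]  # observation times (days)
rv = 3.0 * np.sin(2 * np.pi * t[:, 0] / 27.0) + rng.normal(0, 0.5, 60)  # toy RVs (m/s)

# Quasi-periodic kernel: periodic variability (stellar rotation, ~27 d)
# modulated by a slow decay (active-region lifetimes), plus white noise.
kernel = (RBF(length_scale=50.0)
          * ExpSineSquared(length_scale=1.0, periodicity=27.0)
          + WhiteKernel(noise_level=0.25))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(t, rv)

t_pred = np.linspace(0, 100, 500)[:, None]
mean, std = gp.predict(t_pred, return_std=True)  # activity model + uncertainty
```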

Enhancing Cancer Prediction in Challenging Screen-Detected Incident Lung Nodules Using Time-Series Deep Learning

Mar 30, 2022
Shahab Aslani, Pavan Alluri, Eyjolfur Gudmundsson, Edward Chandy, John McCabe, Anand Devaraj, Carolyn Horst, Sam M Janes, Rahul Chakkara, Arjun Nair, Daniel C Alexander, SUMMIT consortium, Joseph Jacob

Lung cancer is the leading cause of cancer-related mortality worldwide. Lung cancer screening (LCS) using annual low-dose computed tomography (CT) scanning has been proven to significantly reduce lung cancer mortality by detecting cancerous lung nodules at an earlier stage. Risk stratification of malignancy in lung nodules can be enhanced using machine/deep learning algorithms. However, most existing algorithms: a) have primarily assessed single time-point CT data alone, thereby failing to utilize the inherent advantages of longitudinal imaging datasets; b) have not integrated pertinent clinical data that might inform risk prediction; c) have not assessed performance on the spectrum of nodules that are most challenging for radiologists to interpret, where assistance from analytic tools would be most beneficial. Here we show the performance of our time-series deep learning model (DeepCAD-NLM-L), which integrates multi-modal information across three longitudinal data domains: nodule-specific, lung-specific, and clinical demographic data. We compared our model to a) radiologist performance on CTs from the National Lung Screening Trial enriched with the most challenging nodules for diagnosis, and b) a nodule management algorithm from a North London LCS study (SUMMIT). Our model demonstrated comparable and complementary performance to radiologists when interpreting challenging lung nodules, and showed improved performance (AUC = 88%) over models utilizing single time-point data only. The results emphasise the importance of time-series, multi-modal analysis when interpreting malignancy risk in LCS.
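For illustration only, here is a hypothetical PyTorch sketch of the general idea: per-scan CNN features passed through an LSTM over longitudinal time points and fused with clinical variables. This is not the published DeepCAD-NLM-L architecture; every layer size and name below is an assumption.

```python
# Hypothetical sketch of the paper's idea (not the published DeepCAD-NLM-L
# architecture): per-scan CNN features fed to an LSTM over longitudinal
# time points, concatenated with clinical variables for malignancy risk.
import torch
import torch.nn as nn

class LongitudinalNoduleClassifier(nn.Module):
    def __init__(self, feat_dim=128, clin_dim=8, hidden=64):
        super().__init__()
        # Toy 3D CNN encoder for one nodule patch per time point.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(4),
            nn.Flatten(), nn.Linear(16 * 4 ** 3, feat_dim), nn.ReLU(),
        )
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden + clin_dim, 1)  # fuse imaging + clinical

    def forward(self, scans, clinical):
        # scans: (batch, time, 1, D, H, W); clinical: (batch, clin_dim)
        b, t = scans.shape[:2]
        feats = self.encoder(scans.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.lstm(feats)  # last hidden state summarises the history
        return torch.sigmoid(self.head(torch.cat([h[-1], clinical], dim=1)))

model = LongitudinalNoduleClassifier()
risk = model(torch.randn(2, 3, 1, 32, 32, 32), torch.randn(2, 8))  # 3 annual CTs
```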

Residual Networks as Flows of Velocity Fields for Diffeomorphic Time Series Alignment

Jun 22, 2021
Hao Huang, Boulbaba Ben Amor, Xichan Lin, Fan Zhu, Yi Fang

Non-linear (large) time warping is a challenging source of nuisance in time-series analysis. In this paper, we propose a novel diffeomorphic temporal transformer network for both pairwise and joint time-series alignment. Our ResNet-TW (Deep Residual Network for Time Warping) tackles the alignment problem by composing a flow of incremental diffeomorphic mappings. Governed by the flow equation, our Residual Network (ResNet) builds smooth, fluid and regular flows of velocity fields, and consequently generates smooth and invertible transformations (i.e., diffeomorphic warping functions). Inspired by the elegant Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework, the final transformation is built from the flow of time-dependent vector fields, which are none other than the building blocks of our Residual Network. The latter is naturally viewed as an Eulerian discretization scheme of the flow equation (an ODE). Once trained, our ResNet-TW aligns unseen data in a single inexpensive forward pass. As we show in experiments on both univariate (84 datasets from the UCR archive) and multivariate time series (MSR Action-3D, Florence-3D and MSR Daily Activity), ResNet-TW achieves competitive performance in joint alignment and classification.

* 19 pages 
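The flow-of-velocity-fields construction can be sketched in a few lines of PyTorch: each residual block is one Euler step of the flow ODE, so the composed map is a smooth, invertible warping of the time axis. The network sizes and step count below are assumptions, not the authors' implementation.

```python
# Sketch of the core idea (assumed form, not the authors' code): a stack of
# residual blocks acts as Euler steps integrating a velocity field v(x, t),
# so the composed map is a smooth, invertible warping of the time axis.
import torch
import torch.nn as nn

class VelocityField(nn.Module):
    """One time-dependent velocity field v(x, t) on [0, 1]."""
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))

class ResNetTW(nn.Module):
    """Euler discretization: x_{k+1} = x_k + (1/K) * v_k(x_k, t_k)."""
    def __init__(self, n_steps=10):
        super().__init__()
        self.fields = nn.ModuleList(VelocityField() for _ in range(n_steps))

    def forward(self, x):
        K = len(self.fields)
        for k, v in enumerate(self.fields):
            t = torch.full_like(x, k / K)
            x = x + v(x, t) / K  # small steps keep the composed map invertible
        return x

warp = ResNetTW()
grid = torch.linspace(0, 1, 100).unsqueeze(-1)  # reference time stamps
warped = warp(grid)                             # diffeomorphic warping of the axis
```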

A Time Series Analysis-Based Stock Price Prediction Using Machine Learning and Deep Learning Models

Apr 17, 2020
Sidra Mehtab, Jaydip Sen

Prediction of the future movement of stock prices has always been a challenging task for researchers. While advocates of the efficient market hypothesis (EMH) believe it is impossible to design any predictive framework that can accurately predict the movement of stock prices, there is seminal work in the literature that has clearly demonstrated that the seemingly random movement patterns in the time series of a stock price can be predicted with a high level of accuracy. Designing such predictive models requires the choice of appropriate variables, the right transformation methods for the variables, and tuning of the model parameters. In this work, we present a robust and accurate framework for stock price prediction that consists of an agglomeration of statistical, machine learning and deep learning models. We use stock price data of a well-known company listed on the National Stock Exchange (NSE) of India, collected at five-minute intervals. The granular data is aggregated into three slots per day, and the aggregated data is used for building and training the forecasting models. We contend that this agglomerative approach to model building, combining statistical, machine learning, and deep learning approaches, can effectively learn from the volatile and random movement patterns in stock price data. We build eight classification and eight regression models based on statistical and machine learning approaches. In addition to these models, a deep learning regression model using a long short-term memory (LSTM) network is also built. Extensive results are presented on the performance of these models, and the results are critically analyzed.

* NSHM_KOL_2020_SCA_DS_1  
* 46 Pages, 36 Figures, 21 Tables 
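The distinctive preprocessing step, aggregating five-minute prices into three intraday slots, might look roughly as follows with pandas. The slot boundaries within the NSE trading session (09:15 to 15:30) are assumptions for illustration, not the paper's exact cut-offs.

```python
# Sketch of the data preparation step (slot boundaries are assumptions):
# five-minute NSE prices aggregated into three intraday slots per trading day,
# which then feed the statistical/ML/LSTM forecasting models.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
idx = pd.date_range("2020-01-06 09:15", "2020-01-06 15:30", freq="5min")
prices = pd.Series(100 + rng.normal(0, 0.2, len(idx)).cumsum(), index=idx)

def slot(ts):
    # Assumed three-way split of the NSE session (09:15-15:30).
    if ts.time() < pd.Timestamp("11:20").time():
        return "morning"
    if ts.time() < pd.Timestamp("13:25").time():
        return "midday"
    return "afternoon"

daily_slots = prices.groupby([prices.index.date, prices.index.map(slot)]).mean()
print(daily_slots)  # one aggregated value per (day, slot) for model training
```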

Online Metro Origin-Destination Prediction via Heterogeneous Information Aggregation

Aug 02, 2021
Lingbo Liu, Yuying Zhu, Guanbin Li, Ziyi Wu, Lei Bai, Liang Lin

Metro origin-destination prediction is a crucial yet challenging time-series analysis task in intelligent transportation systems, which aims to accurately forecast two specific types of cross-station ridership: Origin-Destination (OD) ridership and Destination-Origin (DO) ridership. However, complete OD matrices of previous time intervals cannot be obtained immediately in online metro systems, and conventional methods use only limited information to forecast future OD and DO ridership separately. In this work, we propose a novel neural network module termed Heterogeneous Information Aggregation Machine (HIAM), which fully exploits heterogeneous information in historical data (e.g., incomplete OD matrices, unfinished order vectors, and DO matrices) to jointly learn the evolutionary patterns of OD and DO ridership. Specifically, an OD modeling branch explicitly estimates the potential destinations of unfinished orders to complement the information of incomplete OD matrices, while a DO modeling branch takes DO matrices as input to capture the spatial-temporal distribution of DO ridership. Moreover, a Dual Information Transformer is introduced to propagate the mutual information between OD features and DO features for modeling the OD-DO causality and correlation. Based on the proposed HIAM, we develop a unified Seq2Seq network to forecast future OD and DO ridership simultaneously. Extensive experiments conducted on two large-scale benchmarks demonstrate the effectiveness of our method for online metro origin-destination prediction.
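A generic Seq2Seq forecasting backbone of the kind HIAM plugs into can be sketched as follows in PyTorch. This is not HIAM itself: the station count, horizons, and hidden size are arbitrary assumptions, and the heterogeneous-information branches are omitted.

```python
# Generic Seq2Seq sketch of the forecasting backbone (not HIAM itself):
# past OD snapshots, flattened per time interval, are encoded with a GRU
# and decoded into future intervals. All dimensions are assumptions.
import torch
import torch.nn as nn

N_STATIONS, HIST, HORIZON = 20, 8, 4
D = N_STATIONS * N_STATIONS  # flattened OD matrix per time interval

class Seq2SeqOD(nn.Module):
    def __init__(self, hidden=256):
        super().__init__()
        self.encoder = nn.GRU(D, hidden, batch_first=True)
        self.decoder = nn.GRU(D, hidden, batch_first=True)
        self.out = nn.Linear(hidden, D)

    def forward(self, history):
        # history: (batch, HIST, D), e.g. masked/incomplete OD matrices
        _, h = self.encoder(history)
        step = history[:, -1:, :]      # seed decoder with the last observation
        preds = []
        for _ in range(HORIZON):
            o, h = self.decoder(step, h)
            step = self.out(o)         # next predicted OD snapshot
            preds.append(step)
        return torch.cat(preds, dim=1)  # (batch, HORIZON, D)

model = Seq2SeqOD()
future = model(torch.rand(2, HIST, D))  # forecast 4 future intervals
```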

Transform-Based Multilinear Dynamical System for Tensor Time Series Analysis

Nov 18, 2018
Weijun Lu, Xiao-Yang Liu, Qingwei Wu, Yue Sun, Anwar Walid

We propose a novel multilinear dynamical system (MLDS) in a transform domain, named $\mathcal{L}$-MLDS, to model tensor time series. With transformations applied to the tensor data, the latent multidimensional correlations among the frontal slices are built in, resulting in computational independence in the transform domain. This allows the exact separation of the multi-dimensional problem into multiple smaller LDS problems. To estimate the system parameters, we utilize the expectation-maximization (EM) algorithm to determine the parameters of each LDS. Further, $\mathcal{L}$-MLDS significantly reduces the number of model parameters and allows parallel processing. Our general $\mathcal{L}$-MLDS model is implemented with different transforms: the discrete Fourier transform, discrete cosine transform and discrete wavelet transform. Due to the nonlinearity of these transformations, $\mathcal{L}$-MLDS is able to capture nonlinear correlations within the data, unlike MLDS (Rogers et al., 2013), which assumes multi-way linear correlations. Using four real datasets, the proposed $\mathcal{L}$-MLDS is shown to achieve much higher prediction accuracy than the state-of-the-art MLDS and LDS with an equal number of parameters under different noise models. In particular, the relative errors are reduced by 50% to 99%. Simultaneously, $\mathcal{L}$-MLDS achieves an exponential improvement over MLDS in training time.
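The transform-domain decoupling can be sketched with NumPy: a DFT along the third tensor mode makes the frontal slices independent, so each slice can be modelled and predicted separately. Here a least-squares AR(1) fit per slice stands in for the paper's EM-fitted LDS; the tensor sizes are arbitrary.

```python
# Sketch of the transform-domain decoupling (an AR(1) least-squares fit
# stands in for the full EM-fitted LDS of each slice): a DFT along the
# third mode decouples the frontal slices, which are modelled independently.
import numpy as np

rng = np.random.default_rng(3)
T, n1, n2 = 200, 4, 6
X = rng.normal(size=(T, n1, n2)).cumsum(axis=0)  # toy tensor time series

# Transform along the third mode: frontal slices become independent.
Xf = np.fft.fft(X, axis=2)

# Fit an independent linear model per frontal slice and predict one step ahead.
pred = np.empty_like(Xf[0])
for k in range(n2):
    past, future = Xf[:-1, :, k], Xf[1:, :, k]
    A = np.linalg.lstsq(past, future, rcond=None)[0]  # slice transition matrix
    pred[:, k] = Xf[-1, :, k] @ A

X_next = np.fft.ifft(pred, axis=-1).real  # back to the data domain
```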

The Connection between Discrete- and Continuous-Time Descriptions of Gaussian Continuous Processes

Jan 20, 2021
Federica Ferretti, Victor Chardès, Thierry Mora, Aleksandra M Walczak, Irene Giardina

Learning the continuous equations of motion from discrete observations is a common task in all areas of physics. However, not every discretization of a Gaussian continuous-time stochastic process can be adopted in parametric inference. We show that discretizations yielding consistent estimators have the property of 'invariance under coarse-graining', and correspond (for linear processes) to fixed points of a renormalization group map on the space of autoregressive moving average (ARMA) models. This result explains why combining differencing schemes for derivative reconstruction with local-in-time inference approaches does not work for time series analysis of second- or higher-order stochastic differential equations, even if the corresponding integration schemes may be acceptably good for numerical simulations.

* 5 pages, 2 figures; 13-page Supplemental Material 
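For the simplest linear case, the fixed-point property is easy to check numerically: subsampling an AR(1) process with coefficient phi by a factor k yields another AR(1) with coefficient phi**k, i.e. the model class is invariant under coarse-graining. The sketch below illustrates this textbook fact and is not the paper's code; all parameters are arbitrary.

```python
# Numerical illustration of 'invariance under coarse-graining': an AR(1)
# with coefficient phi, subsampled by a factor k, is again AR(1) with
# coefficient phi**k (up to rescaled noise), hence a fixed point of the
# renormalization map. Parameters are arbitrary.
import numpy as np

rng = np.random.default_rng(4)
phi, n, k = 0.9, 200_000, 4
x = np.empty(n)
x[0] = 0.0
for t in range(1, n):                 # simulate AR(1)
    x[t] = phi * x[t - 1] + rng.normal()

def ar1_coeff(y):
    # Least-squares estimate of the AR(1) coefficient.
    return y[:-1] @ y[1:] / (y[:-1] @ y[:-1])

print(ar1_coeff(x))       # ~ 0.9
print(ar1_coeff(x[::k]))  # ~ 0.9**4 = 0.6561: same model class after coarse-graining
```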

Feature space approximation for kernel-based supervised learning

Nov 25, 2020
Patrick Gelß, Stefan Klus, Ingmar Schuster, Christof Schütte

We propose a method for the approximation of high- or even infinite-dimensional feature vectors, which play an important role in supervised learning. The goal is to reduce the size of the training data, resulting in lower storage consumption and computational complexity. Furthermore, the method can be regarded as a regularization technique, which improves the generalizability of learned target functions. We demonstrate significant improvements in comparison to the computation of data-driven predictions involving the full training data set. The method is applied to classification and regression problems from different application areas such as image recognition, system identification, and oceanographic time series analysis.
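A readily available point of comparison for the same goal is scikit-learn's Nystroem transformer, which likewise trades the full kernel matrix for a compressed feature space built from a small set of landmark points. This is a different method from the authors'; it is shown only to make the idea of feature-space approximation concrete.

```python
# Off-the-shelf baseline for the same goal (scikit-learn's Nystroem method,
# not the authors' approach): approximate the kernel feature space with a
# small set of landmark points, then train a linear model on top of it.
from sklearn.datasets import load_digits
from sklearn.kernel_approximation import Nystroem
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# 100 landmarks stand in for the full n x n kernel matrix.
clf = make_pipeline(Nystroem(kernel="rbf", gamma=0.002, n_components=100),
                    LogisticRegression(max_iter=1000))
clf.fit(X_tr, y_tr)
print(clf.score(X_te, y_te))  # accuracy with the compressed feature space
```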
