"Time Series Analysis": models, code, and papers

Machine learning methods for modelling and analysis of time series signals in geoinformatics

Sep 16, 2021
Maria Kaselimi

This dissertation provides a comparative analysis that evaluates the performance of several deep learning (DL) architectures on a large number of time series datasets of different nature and for different applications. Two research fields are discussed here, chosen strategically to address current cross-disciplinary research priorities that attract the interest of the geodetic community. The first problem is ionospheric Total Electron Content (TEC) modeling, which is an important issue in many real-time Global Navigation Satellite System (GNSS) applications. Reliable and fast knowledge about ionospheric variations is becoming increasingly important. GNSS users of single-frequency receivers and satellite navigation systems need accurate corrections to remove signal degradation effects caused by the ionosphere. Ionospheric modeling using signal processing techniques is the subject of discussion in the present contribution. The second problem is energy disaggregation, which is an important issue for energy efficiency and energy consumption awareness. Reliable and fast knowledge about residential energy consumption at the appliance level is becoming increasingly important, and it is an important mitigation measure to prevent energy wastage. Energy disaggregation, or non-intrusive load monitoring (NILM), is a single-channel blind source separation problem in which the task is to estimate the consumption of each electrical appliance given the total energy consumption. For both problems, various deep learning models are proposed that cover different aspects of the problem under study, and experimental results indicate the proposed methods' superiority over the current state of the art.

* arXiv admin note: text overlap with arXiv:2004.13408 by other authors 

An Optimized and Energy-Efficient Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks

Nov 26, 2019
Julia El Zini, Yara Rizk, Mariette Awad

Recurrent neural networks (RNN) have been successfully applied to various sequential decision-making tasks, natural language processing applications, and time-series predictions. Such networks are usually trained through back-propagation through time (BPTT), which is prohibitively expensive, especially when the length of the time dependencies and the number of hidden neurons increase. To reduce the training time, extreme learning machines (ELMs) have recently been applied to RNN training, reaching a 99% speedup on some applications. Due to its non-iterative nature, ELM training, when parallelized, has the potential to reach higher speedups than BPTT. In this work, we present an optimized parallel RNN training algorithm based on ELM that takes advantage of the GPU shared memory and of parallel QR factorization algorithms to efficiently reach optimal solutions. The theoretical analysis of the proposed algorithm is presented on six RNN architectures, including LSTM and GRU, and its performance is empirically tested on ten time-series prediction applications. The proposed algorithm is shown to reach up to 845 times speedup over its sequential counterpart and to require up to 20x less time to train than parallel BPTT.
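
To illustrate the non-iterative training idea described in this abstract, here is a minimal CPU sketch in Python with NumPy. It is not the paper's algorithm: it assumes a plain feedforward ELM applied to lagged windows of a series (no recurrent cells, no GPU shared memory), and the names `make_windows`, `elm_fit`, and `elm_predict` are hypothetical. The only point it shows is that the hidden weights stay random and the output weights come from a single least-squares solve via QR factorization.

```python
import numpy as np


def make_windows(series, lag):
    """Turn a 1-D series into (lagged inputs, next-value targets)."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
    y = series[lag:]
    return X, y


def elm_fit(X, y, n_hidden=10, seed=0):
    """Fit an ELM: random fixed hidden layer, output weights by least squares."""
    rng = np.random.default_rng(seed)
    # Hidden-layer parameters are drawn once and never trained (assumption:
    # standard normal initialisation, tanh activation).
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    # Reduced QR: H = Q R, so the least-squares solution solves R beta = Q^T y.
    Q, R = np.linalg.qr(H)
    beta = np.linalg.solve(R, Q.T @ y)
    return W, b, beta


def elm_predict(X, W, b, beta):
    """Apply the fixed hidden layer and the learned output weights."""
    return np.tanh(X @ W + b) @ beta
```

Because there is no gradient loop, the cost is dominated by one matrix product and one QR factorization, which is the part a parallel implementation would target.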


Deep Adaptive Input Normalization for Price Forecasting using Limit Order Book Data

Feb 21, 2019
Nikolaos Passalis, Anastasios Tefas, Juho Kanniainen, Moncef Gabbouj, Alexandros Iosifidis

Deep Learning (DL) models can be used to tackle time series analysis tasks with great success. However, the performance of DL models can degenerate rapidly if the data are not appropriately normalized. This issue is even more apparent when DL is used for financial time series forecasting tasks, where the non-stationary and multimodal nature of the data poses significant challenges and severely affects the performance of DL models. In this work, a simple yet effective neural layer is proposed that is capable of adaptively normalizing the input time series while taking into account the distribution of the data. The proposed layer is trained in an end-to-end fashion using back-propagation and can lead to significant performance improvements. The effectiveness of the proposed method is demonstrated using a large-scale limit order book dataset.


Topological Data Analysis in Time Series: Temporal Filtration and Application to Single-Cell Genomics

Apr 29, 2022
Baihan Lin

The absence of a conventional association between the cell-cell cohabitation and its emergent dynamics into cliques during development has hindered our understanding of how cell populations proliferate, differentiate, and compete, i.e. the cell ecology. With the recent advancement of the single-cell RNA-sequencing (RNA-seq), we can potentially describe such a link by constructing network graphs that characterize the similarity of the gene expression profiles of the cell-specific transcriptional programs, and analyzing these graphs systematically using the summary statistics informed by the algebraic topology. We propose the single-cell topological simplicial analysis (scTSA). Applying this approach to the single-cell gene expression profiles from local networks of cells in different developmental stages with different outcomes reveals a previously unseen topology of cellular ecology. These networks contain an abundance of cliques of single-cell profiles bound into cavities that guide the emergence of more complicated habitation forms. We visualize these ecological patterns with topological simplicial architectures of these networks, compared with the null models. Benchmarked on the single-cell RNA-seq data of zebrafish embryogenesis spanning 38,731 cells, 25 cell types and 12 time steps, our approach highlights the gastrulation as the most critical stage, consistent with consensus in developmental biology. As a nonlinear, model-independent, and unsupervised framework, our approach can also be applied to tracing multi-scale cell lineage, identifying critical stages, or creating pseudo-time series.

* Codes at 

Boosting the kernelized shapelets: Theory and algorithms for local features

Sep 07, 2017
Daiki Suehiro, Kohei Hatano, Eiji Takimoto, Shuji Yamamoto, Kenichi Bannai, Akiko Takeda

We consider binary classification problems using local features of objects. One motivating application is time-series classification, where features reflecting a local closeness measure between a time series and a pattern sequence called a shapelet are useful. Despite the empirical success of such approaches using local features, the generalization ability of the resulting hypotheses is not fully understood, and previous work relies on a number of heuristics. In this paper, we formulate a class of hypotheses using local features, where the richness of features is controlled by kernels. We derive generalization bounds for sparse ensembles over the class that are exponentially better than a standard analysis in terms of the number of possible local features. The resulting optimization problem is well suited to the boosting approach, and the weak learning problem is formulated as a DC program, for which practical algorithms exist. In preliminary experiments on time-series data sets, our method achieves accuracy competitive with state-of-the-art algorithms at a small parameter-tuning cost.
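
The notion of a local feature between a time series and a shapelet can be sketched as follows. This is an illustrative assumption rather than the paper's exact hypothesis class: the feature is taken to be the best Gaussian-kernel similarity between the shapelet and any equal-length subsequence of the series. The names `gaussian_kernel` and `shapelet_feature` and the bandwidth `sigma` are hypothetical.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) similarity between two equal-length sequences."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def shapelet_feature(series, shapelet, sigma=1.0):
    """Best kernel similarity of the shapelet over all sliding windows."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    return max(
        gaussian_kernel(series[i:i + m], shapelet, sigma)
        for i in range(len(series) - m + 1)
    )
```

A series that contains the pattern exactly scores 1.0, and the score decays toward 0 as the closest window moves away from the shapelet. A classifier can then be built on a small set of such features, one per candidate shapelet.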

* 16 pages, 1 figure 

Explaining Outcomes of Multi-Party Dialogues using Causal Learning

May 03, 2021
Priyanka Sinha, Pabitra Mitra, Antonio Anastasio Bruto da Costa, Nikolaos Kekatos

Multi-party dialogues are common in enterprise social media on technical as well as non-technical topics. The outcome of a conversation may be positive or negative. It is important to analyze why a dialogue ends with a particular sentiment from the point of view of conflict analysis as well as future collaboration design. We propose an explainable time series mining algorithm for such analysis. A dialogue is represented as an attributed time series of occurrences of keywords, EMPATH categories, and inferred sentiments at various points in its progress. A special decision tree, with decision metrics that take into account temporal relationships between dialogue events, is used for predicting the cause of the outcome sentiment. Interpretable rules mined from the classifier are used to explain the prediction. Experimental results are presented for the enterprise social media posts in a large company.


Gaussian Process Conditional Copulas with Applications to Financial Time Series

Jul 01, 2013
José Miguel Hernández-Lobato, James Robert Lloyd, Daniel Hernández-Lobato

The estimation of dependencies between multiple variables is a central problem in the analysis of financial time series. A common approach is to express these dependencies in terms of a copula function. Typically the copula function is assumed to be constant but this may be inaccurate when there are covariates that could have a large influence on the dependence structure of the data. To account for this, a Bayesian framework for the estimation of conditional copulas is proposed. In this framework the parameters of a copula are non-linearly related to some arbitrary conditioning variables. We evaluate the ability of our method to predict time-varying dependencies on several equities and currencies and observe consistent performance gains compared to static copula models and other time-varying copula methods.


Disentangling Identifiable Features from Noisy Data with Structured Nonlinear ICA

Jun 17, 2021
Hermanni Hälvä, Sylvain Le Corff, Luc Lehéricy, Jonathan So, Yongjie Zhu, Elisabeth Gassiat, Aapo Hyvarinen

We introduce a new general identifiable framework for principled disentanglement referred to as Structured Nonlinear Independent Component Analysis (SNICA). Our contribution is to extend the identifiability theory of deep generative models for a very broad class of structured models. While previous works have shown identifiability for specific classes of time-series models, our theorems extend this to more general temporal structures as well as to models with more complex structures such as spatial dependencies. In particular, we establish the major result that identifiability for this framework holds even in the presence of noise of unknown distribution. The SNICA setting therefore subsumes all the existing nonlinear ICA models for time-series and also allows for new much richer identifiable models. Finally, as an example of our framework's flexibility, we introduce the first nonlinear ICA model for time-series that combines the following very useful properties: it accounts for both nonstationarity and autocorrelation in a fully unsupervised setting; performs dimensionality reduction; models hidden states; and enables principled estimation and inference by variational maximum-likelihood.

* preprint 

Early Abandoning and Pruning for Elastic Distances

Feb 10, 2021
Matthieu Herrmann, Geoffrey I. Webb

Elastic distances are key tools for time series analysis. Straightforward implementations have O(n^2) space and time complexity, preventing many applications from scaling to long series. Much work has been devoted to speeding up these applications, mostly through the development of lower bounds, which make it possible to avoid costly distance computations when a given threshold is exceeded. This threshold also makes it possible to abandon the computation of the distance itself early. Another approach, developed for DTW, is to prune parts of the computation. All these techniques are orthogonal to each other. In this work, we develop a new generic strategy, "EAPruned", that tightly integrates pruning with early abandoning. We apply it to DTW, CDTW, WDTW, ERP, MSM and TWE, showing substantial speedup in NN1-like scenarios. Pruning also shows substantial speedup for some distances, benefiting applications such as clustering where all pairwise distances are required and hence early abandoning is not applicable. We release our implementation as part of a new C++ library for time series classification, along with easy-to-use Python/Numpy bindings.
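
The early-abandoning idea mentioned in this abstract can be sketched for DTW as below. This is the classic row-minimum form of early abandoning, not the paper's "EAPruned" strategy, and it does no pruning. It assumes a squared-difference cost; the function name `dtw_early_abandon` and the `cutoff` parameter are hypothetical. Keeping only two rows also reduces the space cost from quadratic to linear, while the time cost stays quadratic in the worst case.

```python
import math


def dtw_early_abandon(a, b, cutoff=math.inf):
    """DTW with squared-difference cost; returns inf once cutoff is exceeded."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both series must be non-empty")
    # prev[j] holds the cost of the best path ending at (i-1, j);
    # index 0 is a border cell that is reachable only before the first row.
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        # Every warping path crosses this row and costs are non-negative,
        # so the row minimum is a lower bound on the final distance.
        if min(curr[1:]) > cutoff:
            return math.inf
        prev = curr
    return prev[m]
```

In a nearest-neighbour search, `cutoff` would typically be the best distance found so far, so most candidates are abandoned after a few rows.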


Topology-based Clusterwise Regression for User Segmentation and Demand Forecasting

Sep 08, 2020
Rodrigo Rivera-Castro, Aleksandr Pletnev, Polina Pilyugina, Grecia Diaz, Ivan Nazarov, Wanyi Zhu, Evgeny Burnaev

Topological Data Analysis (TDA) is a recent approach to analyze data sets from the perspective of their topological structure. Its use for time series data has been limited. In this work, a system developed for a leading provider of cloud computing combining both user segmentation and demand forecasting is presented. It consists of a TDA-based clustering method for time series inspired by a popular managerial framework for customer segmentation and extended to the case of clusterwise regression using matrix factorization methods to forecast demand. Increasing customer loyalty and producing accurate forecasts remain active topics of discussion both for researchers and managers. Using a public and a novel proprietary data set of commercial data, this research shows that the proposed system enables analysts to both cluster their user base and plan demand at a granular level with significantly higher accuracy than a state of the art baseline. This work thus seeks to introduce TDA-based clustering of time series and clusterwise regression with matrix factorization methods as viable tools for the practitioner.

* 2019 IEEE International Conference on Data Science and Advanced Analytics (DSAA) 