"Time Series Analysis": models, code, and papers

Behavioral Model Inference of Black-box Software using Deep Neural Networks

Jan 13, 2021
Mohammad Jafar Mashhadi, Foozhan Ataiefard, Hadi Hemmati, Niel Walkinshaw

Many software engineering tasks, such as testing and anomaly detection, can benefit from the ability to infer a behavioral model of the software. Most existing inference approaches assume access to code to collect execution sequences. In this paper, we investigate a black-box scenario, where the system under analysis cannot be instrumented in this granular fashion. This scenario is particularly prevalent with control systems' log analysis in the form of continuous signals. In this situation, an execution trace amounts to a multivariate time series of input and output signals, where different states of the system correspond to different "phases" in the time series. The main challenge is to detect when these phase changes take place. Unfortunately, most existing solutions are either univariate, make assumptions on the data distribution, or have limited learning power. Therefore, we propose a hybrid deep neural network that accepts as input a multivariate time series and applies a set of convolutional and recurrent layers to learn the non-linear correlations between signals and the patterns over time. We show how this approach can be used to accurately detect state changes, and how the inferred models can be successfully applied to transfer-learning scenarios, to accurately process traces from different products with similar execution characteristics. Our experimental results on two UAV autopilot case studies indicate that our approach is highly accurate (over 90% F1 score for state classification) and significantly improves on baselines (by up to 102% for change point detection). Using transfer learning, we also show that up to 90% of the maximum achievable F1 scores in the open-source case study can be achieved by reusing the trained models from the industrial case and only fine-tuning them using as few as 5 labeled samples, which reduces the manual labeling effort by 98%.

* 16 pages, 8 figures. arXiv admin note: text overlap with arXiv:2008.11856
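
As a small illustration of the link between state classification and change point detection mentioned in the abstract, the sketch below derives change points from a sequence of per-time-step state labels. It assumes the network emits one state label per time step; the function name `change_points` and the example labels are hypothetical and not taken from the paper.

```python
import numpy as np

def change_points(state_labels):
    """Return every index t whose label differs from the label at t - 1."""
    labels = np.asarray(state_labels)
    return np.flatnonzero(labels[1:] != labels[:-1]) + 1

# Hypothetical per-time-step state predictions for one execution trace.
predicted = [0, 0, 0, 2, 2, 1, 1, 1]
print(change_points(predicted))  # [3 5]
```

In practice the predicted labels would probably need smoothing first, since a single misclassified time step produces two spurious change points.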

An Energy-concentrated Wavelet Transform for Time Frequency Analysis of Transient Signals

Feb 22, 2022
Haoran Dong, Gang Yu

Transient signals are often composed of a series of modes that have multivalued time-dependent instantaneous frequency (IF), which brings challenges to the development of signal processing technology. Fortunately, the group delay (GD) of such a signal can be well expressed as a single-valued function of frequency. By considering the frequency-domain signal model, we present a postprocessing method called the wavelet transform (WT)-based time-reassigned synchrosqueezing transform (WTSST). Our proposed method embeds a two-dimensional GD operator into a synchrosqueezing framework to generate a time-frequency representation (TFR) of a transient signal with high energy concentration, and it allows the whole or part of the signal to be retrieved. Theoretical analyses of the WTSST are provided, including the analysis of GD candidate accuracy and signal reconstruction accuracy. Moreover, based on WTSST, the WT-based time-reassigned multisynchrosqueezing transform (WTMSST) is proposed by introducing a stepwise refinement scheme, which addresses the drawback that the WTSST method is unable to deal with strongly frequency-varying signals. Simulation and real signal analysis illustrate that the proposed methods have the capacity to appropriately describe the features of transient signals.

Similarity Preserving Representation Learning for Time Series Analysis

Mar 09, 2017
Qi Lei, Jinfeng Yi, Roman Vaculin, Lingfei Wu, Inderjit S. Dhillon

A considerable number of machine learning algorithms take instance-feature matrices as their inputs. As such, they cannot directly analyze time series data due to its temporal nature, usually unequal lengths, and complex properties. This is a great pity since many of these algorithms are effective, robust, efficient, and easy to use. In this paper, we bridge this gap by proposing an efficient representation learning framework that is able to convert a set of time series with equal or unequal lengths to a matrix format. In particular, we guarantee that the pairwise similarities between time series are well preserved after the transformation. The learned feature representation is particularly suitable for the class of learning problems that are sensitive to data similarities. Given a set of $n$ time series, we first construct an $n\times n$ partially observed similarity matrix by randomly sampling $O(n \log n)$ pairs of time series and computing their pairwise similarities. We then propose an extremely efficient algorithm that solves a highly non-convex and NP-hard problem to learn new features based on the partially observed similarity matrix. We use the learned features to conduct experiments on both data classification and clustering tasks. Our extensive experimental results demonstrate that the proposed framework is both effective and efficient.
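
The first step described above, building a partially observed $n\times n$ similarity matrix from roughly $n \log n$ sampled pairs, can be sketched as follows. The abstract does not name the similarity measure, so this sketch assumes a dynamic time warping (DTW) distance mapped to a similarity with `exp(-d)`; the helper names and the toy data are hypothetical, and the feature-learning step is not shown.

```python
import math
import numpy as np

def dtw_distance(a, b):
    """Plain dynamic time warping distance with absolute-difference cost."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def sampled_similarity(series, n_pairs, rng):
    """Fill a symmetric similarity matrix at n_pairs randomly drawn index pairs."""
    n = len(series)
    sim = np.full((n, n), np.nan)            # NaN marks unobserved entries
    observed = np.zeros((n, n), dtype=bool)  # mask of observed entries
    for _ in range(n_pairs):
        i, j = rng.integers(0, n, size=2)
        s = math.exp(-dtw_distance(series[i], series[j]))  # assumed similarity
        sim[i, j] = sim[j, i] = s
        observed[i, j] = observed[j, i] = True
    return sim, observed

rng = np.random.default_rng(0)
# Toy data: 20 series of unequal length (5 to 9 samples each).
series = [rng.standard_normal(rng.integers(5, 10)) for _ in range(20)]
n = len(series)
n_pairs = int(np.ceil(n * np.log(n)))        # O(n log n) sampled pairs
sim, observed = sampled_similarity(series, n_pairs, rng)
print(f"observed {observed.sum()} of {n * n} entries")
```

Pairs are drawn with replacement here, so the number of distinct observed entries can be slightly below `2 * n_pairs`.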

An analysis of deep neural networks for predicting trends in time series data

Sep 16, 2020
Kouame Hermann Kouassi, Deshendran Moodley

The emergence of small and portable smart sensors has opened up new opportunities for many applications, including automated factories, smart cities and connected healthcare, broadly referred to as the "Internet of Things (IoT)". These devices produce time series data. While deep neural networks (DNNs) have been widely applied to computer vision, natural language processing and speech recognition, there is limited research on DNNs for time series prediction. Machine learning (ML) applications for time series prediction have traditionally involved predicting the next value in the series. However, in certain applications, segmenting the time series into a sequence of trends and predicting the next trend is preferred. Recently, a hybrid DNN algorithm, TreNet, was proposed for trend prediction. TreNet, which combines an LSTM that takes in trendlines and a CNN that takes in point data, was shown to have superior performance for trend prediction when compared to other approaches. However, the study used a standard cross-validation method, which does not take into account the sequential nature of time series. In this work, we reproduce TreNet using a walk-forward validation method, which is more appropriate to time series data. We compare the performance of the hybrid TreNet algorithm, on the same three data sets used in the original study, to vanilla MLP, LSTM, and CNN models that take in point data, and also to traditional ML algorithms, i.e. the Random Forest (RF), Support Vector Regression and Gradient Boosting Machine. Our results differ significantly from those reported for the original TreNet. In general, TreNet still performs better than the vanilla DNN models, but not as substantially as reported for the original TreNet. Furthermore, our results show that the RF algorithm performed substantially better than TreNet on the methane data set.
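
To make the contrast with standard cross-validation concrete, here is a minimal sketch of walk-forward splitting in which every test block lies strictly after its training data. The abstract does not say which variant the authors used; this sketch assumes an expanding training window, and `walk_forward_splits` and its parameters are hypothetical names.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train, test) index ranges; the training window grows each fold."""
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        test_end = train_end + test_size
        # Training data always precedes the test block in time.
        yield range(0, train_end), range(train_end, test_end)

for train, test in walk_forward_splits(n_samples=10, n_folds=3, min_train=4):
    print(list(train), list(test))
# [0, 1, 2, 3] [4, 5]
# [0, 1, 2, 3, 4, 5] [6, 7]
# [0, 1, 2, 3, 4, 5, 6, 7] [8, 9]
```

A sliding-window variant would instead drop the oldest samples as the window advances, keeping the training set size fixed.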

Product Reservoir Computing: Time-Series Computation with Multiplicative Neurons

Apr 26, 2015
Alireza Goudarzi, Alireza Shabani, Darko Stefanovic

Echo state networks (ESNs), a type of reservoir computing (RC) architecture, are efficient and accurate artificial neural systems for time series processing and learning. An ESN consists of a core of recurrent neural networks, called a reservoir, with a small number of tunable parameters to generate a high-dimensional representation of an input, and a readout layer which is easily trained using regression to produce a desired output from the reservoir states. Certain computational tasks involve real-time calculation of high-order time correlations, which requires nonlinear transformation either in the reservoir or the readout layer. A traditional ESN employs a reservoir with sigmoid or tanh function neurons. In contrast, some types of biological neurons obey response curves that can be described as a product unit rather than a sum and threshold. Inspired by this class of neurons, we introduce an RC architecture with a reservoir of product nodes for time series computation. We find that the product RC shows many properties of a standard ESN, such as short-term memory and nonlinear capacity. On standard benchmarks for chaotic prediction tasks, the product RC maintains the performance of a standard nonlinear ESN while being more amenable to mathematical analysis. Our study provides evidence that such networks are powerful in highly nonlinear tasks owing to high-order statistics generated by the recurrent product node reservoir.
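
The standard ESN structure described above (a fixed tanh reservoir plus a readout trained by regression) can be sketched as follows. This covers only the traditional ESN; the product-node update is not specified in the abstract and is not attempted. The reservoir size, spectral radius, input scaling, washout length, ridge penalty and the sine-wave task are all illustrative assumptions.

```python
import numpy as np

def make_reservoir(n_units, spectral_radius, input_scale, rng):
    """Random recurrent weights rescaled to a target spectral radius."""
    w = rng.standard_normal((n_units, n_units))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    w_in = input_scale * rng.uniform(-1.0, 1.0, size=n_units)
    return w, w_in

def run_reservoir(w, w_in, u):
    """Drive the tanh reservoir with scalar inputs u and collect its states."""
    x = np.zeros(len(w_in))
    states = np.zeros((len(u), len(w_in)))
    for t, u_t in enumerate(u):
        x = np.tanh(w @ x + w_in * u_t)
        states[t] = x
    return states

def fit_readout(states, targets, ridge):
    """Ridge regression from reservoir states to targets (only trained part)."""
    a = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(a, states.T @ targets)

rng = np.random.default_rng(0)
u = np.sin(0.2 * np.arange(600))           # toy input signal
w, w_in = make_reservoir(50, 0.9, 0.5, rng)
states = run_reservoir(w, w_in, u[:-1])    # state at t sees inputs up to u[t]
washout = 100                              # discard the initial transient
targets = u[1 + washout:]                  # one-step-ahead prediction targets
w_out = fit_readout(states[washout:], targets, ridge=1e-4)
mse = np.mean((states[washout:] @ w_out - targets) ** 2)
print(f"training MSE: {mse:.2e}")
```

Only `w_out` is learned; the reservoir weights stay fixed after initialization, which is what makes training reduce to a linear regression.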

Building Models for Biopathway Dynamics Using Intrinsic Dimensionality Analysis

Nov 03, 2018
Emilia M. Wysocka, Valery Dzutsati, Tirthankar Bandyopadhyay, Laura Condon, Sahil Garg

An important task for many, if not all, scientific domains is efficient knowledge integration, testing and codification. It is often solved with model construction in a controllable computational environment. In spite of that, the throughput of in-silico simulation-based observations becomes similarly intractable for thorough analysis. This is especially the case in molecular biology, which served as the subject of this study. In this project, we aimed to test some approaches developed to deal with the curse of dimensionality. Among these, we found dimension reduction techniques especially appealing. They can be used to identify irrelevant variability and help to understand critical processes underlying high-dimensional datasets. Additionally, we subjected our data sets to nonlinear time series analysis, as those are well-established methods for comparing results. To investigate the usefulness of dimension reduction methods, we decided to base our study on a concrete sample set. The example was taken from the domain of systems biology concerning the dynamic evolution of sub-cellular signaling. In particular, the dataset relates to the yeast pheromone pathway and is studied in-silico with a stochastic model. The model reconstructs signal propagation stimulated by a mating pheromone. In the paper, we elaborate on the reasons for the multidimensional analysis problem in the context of molecular signaling, and next we introduce the model of choice, simulation details and the time series dynamics obtained. A description of the methods used, followed by a discussion of the results and their biological interpretation, concludes the paper.

* Presented at the Santa Fe Complex Systems Summer School (CSSS) 2015

Investigating echo state networks dynamics by means of recurrence analysis

Apr 24, 2016
Filippo Maria Bianchi, Lorenzo Livi, Cesare Alippi

In this paper, we elaborate on the well-known interpretability issue in echo state networks. The idea is to investigate the dynamics of reservoir neurons with time-series analysis techniques taken from research on complex systems. Notably, we analyze time series of neuron activations with Recurrence Plots (RPs) and Recurrence Quantification Analysis (RQA), which make it possible to visualize and characterize high-dimensional dynamical systems. We show that this approach is useful in a number of ways. First, the two-dimensional representation offered by RPs provides a way of visualizing the high-dimensional dynamics of a reservoir. Our results suggest that, if the network is stable, reservoir and input exhibit similar line patterns in the respective RPs. Conversely, the more unstable the ESN, the more the RP of the reservoir presents instability patterns. As a second result, we show that the $\mathrm{L_{max}}$ measure is highly correlated with the well-established maximal local Lyapunov exponent. This suggests that complexity measures based on the distribution of RP diagonal lines provide a valuable tool to quantify the degree of network stability. Finally, our analysis shows that all RQA measures fluctuate in the proximity of the so-called edge of stability, where an ESN typically achieves maximum computational capability. We verify that the determination of the edge of stability provided by such RQA measures is more accurate than two well-known criteria based on the Jacobian matrix of the reservoir. Therefore, we claim that RPs and RQA-based analyses can be used as valuable tools to design an effective network given a specific problem.

* Revised version. 24 pages; 12 figures 
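
For readers unfamiliar with the two quantities named in the abstract, the sketch below builds a recurrence plot from a state trajectory and computes $\mathrm{L_{max}}$ as the longest diagonal line off the main diagonal. It assumes a fixed Euclidean distance threshold `eps`, which is one common convention and may differ from the thresholding used in the paper; the function names and the toy series are hypothetical.

```python
import numpy as np

def recurrence_matrix(x, eps):
    """Boolean matrix R[i, j] = True when states i and j are within eps."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]                     # treat a scalar series as 1-D states
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return dist <= eps

def longest_diagonal(rp):
    """L_max: longest run of recurrent points on any off-main diagonal."""
    best = 0
    for k in range(1, rp.shape[0]):        # the plot is symmetric, so k > 0 suffices
        run = 0
        for is_recurrent in np.diagonal(rp, offset=k):
            run = run + 1 if is_recurrent else 0
            best = max(best, run)
    return best

# Toy period-2 series: states recur exactly every two steps.
rp = recurrence_matrix([0, 1, 0, 1, 0, 1], eps=0.1)
print(longest_diagonal(rp))  # 4
```

For a reservoir, `x` would be the array of neuron activations with one row per time step, so each row is compared against every other row.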

Phenotyping OSA: a time series analysis using fuzzy clustering and persistent homology

Apr 27, 2021
Prachi Loliencar, Giseon Heo

Sleep apnea is a disorder that has serious consequences for the pediatric population. There has been recent concern that traditional diagnosis of the disorder using the apnea-hypopnea index may be ineffective in capturing its multi-faceted outcomes. In this work, we take a first step in addressing this issue by phenotyping patients using a clustering analysis of airflow time series. This is approached in three ways: using feature-based fuzzy clustering in the time domain and in the frequency domain, and using persistent homology to study the signal from a topological perspective. The fuzzy clusters are analyzed in a novel manner using a Dirichlet regression analysis, while the topological approach leverages Takens' embedding theorem to study the periodicity properties of the signals.
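
The delay embedding that Takens' theorem refers to turns a scalar signal into a point cloud, which is the kind of object persistent homology operates on. A minimal sketch follows; the embedding dimension `dim` and delay `tau` are free parameters chosen here purely for illustration, and the function name is hypothetical.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Map a scalar series to points (x[t], x[t + tau], ..., x[t + (dim-1)*tau])."""
    x = np.asarray(x, dtype=float)
    n_points = len(x) - (dim - 1) * tau
    if n_points < 1:
        raise ValueError("series too short for this dim and tau")
    # Column i holds the series shifted by i * tau samples.
    return np.stack([x[i * tau : i * tau + n_points] for i in range(dim)], axis=1)

cloud = delay_embed(np.arange(10), dim=3, tau=2)
print(cloud.shape)  # (6, 3)
print(cloud[0])     # [0. 2. 4.]
```

A periodic signal embedded this way traces out a closed loop, which is why loop-detecting topological features can be used to study periodicity.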

Loss-analysis via Attention-scale for Physiologic Time Series

Nov 08, 2020
Jiawei Yang, Jeffrey M. Hausdorff

Physiologic signals have properties across multiple spatial and temporal scales, which can be shown by complexity analysis of the signals after coarse-graining them with scaling techniques such as the multiscale. Unfortunately, the results obtained from the coarse-grained signals by the multiscale may not fully reflect the properties of the original signals, because scaling techniques cause a loss and the same scaling technique may bring different losses to different signals. Another problem is that the multiscale does not consider the key observations inherent in the signal. Here, we show a new analysis method for time series called loss-analysis via attention-scale. We show that the multiscale is a special case of the attention-scale. Loss-analysis can complement complexity analysis to capture aspects of the signals that are not captured using previously developed measures. This can be used to study ageing, diseases, and other physiologic phenomena.
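
The coarse-graining step that the abstract refers to is, in the usual multiscale analysis setting, an average over non-overlapping windows. The sketch below shows that standard procedure only, on the assumption that it is what "the multiscale" denotes here; the attention-scale and the loss measure are not defined in the abstract and are not attempted. The function name is hypothetical.

```python
import numpy as np

def coarse_grain(x, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    x = np.asarray(x, dtype=float)
    n_windows = len(x) // scale            # trailing samples that do not fill a window are dropped
    return x[: n_windows * scale].reshape(n_windows, scale).mean(axis=1)

print(coarse_grain([1, 2, 3, 4, 5, 6, 7], scale=3))  # [2. 5.]
```

Every window is weighted equally regardless of what it contains, which is consistent with the abstract's remark that this scheme ignores key observations in the signal.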