Although aviation accidents are rare, safety incidents occur more frequently and require a careful analysis to detect and mitigate risks in a timely manner. Analyzing safety incidents using operational data and producing event-based explanations is invaluable to airline companies as well as to governing organizations such as the Federal Aviation Administration (FAA) in the United States. However, this task is challenging because of the complexity involved in mining multi-dimensional heterogeneous time series data, the lack of time-step-wise annotation of events in a flight, and the lack of scalable tools to perform analysis over a large number of events. In this work, we propose a precursor mining algorithm that identifies events in the multidimensional time series that are correlated with the safety incident. Precursors are valuable to systems health and safety monitoring and in explaining and forecasting safety incidents. Current methods suffer from poor scalability to high dimensional time series data and are inefficient in capturing temporal behavior. We propose an approach by combining multiple-instance learning (MIL) and deep recurrent neural networks (DRNN) to take advantage of MIL's ability to learn using weakly supervised data and DRNN's ability to model temporal behavior. We describe the algorithm, the data, the intuition behind taking a MIL approach, and a comparative analysis of the proposed algorithm with baseline models. We also discuss the application to a real-world aviation safety problem using data from a commercial airline company and discuss the model's abilities and shortcomings, with some final remarks about possible deployment directions.
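As a rough illustration of why multiple-instance learning suits flights that carry only a flight-level label, the sketch below treats each flight as a bag of per-time-step scores. The abstract does not state how step scores are aggregated; max pooling is assumed here because it is a common MIL choice, and the function names and the 0.5 threshold are hypothetical.

```python
from typing import List, Sequence


def bag_probability(step_probs: Sequence[float]) -> float:
    """Flight-level (bag) probability under an ASSUMED max-pooling rule.

    step_probs holds one incident probability per time step, e.g. the
    per-step outputs of a recurrent network.
    """
    if not step_probs:
        raise ValueError("a flight needs at least one time step")
    return max(step_probs)


def precursor_steps(step_probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Indices of time steps whose score reaches the (hypothetical) threshold."""
    return [t for t, p in enumerate(step_probs) if p >= threshold]


if __name__ == "__main__":
    scores = [0.1, 0.6, 0.9, 0.3]
    print(bag_probability(scores))   # flight-level score
    print(precursor_steps(scores))   # candidate precursor time steps
```

Only the flight-level label is needed to train against `bag_probability`, while the per-step scores can be inspected afterwards as candidate precursors.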
Analysis of time-series data makes it possible to identify long-term trends and make predictions that can help to improve our lives. With the rapid development of artificial neural networks, the long short-term memory (LSTM) recurrent neural network (RNN) configuration has been found capable of dealing with time-series forecasting problems where data points are time-dependent and exhibit seasonal trends. The gated structure of the LSTM cell and the flexibility in network topology (one-to-many, many-to-one, etc.) make it possible to model systems with multiple input variables and to control several parameters, such as the size of the look-back window used to make a prediction and the number of time steps to be predicted. These properties make LSTM an attractive alternative to conventional methods such as autoregression models, the simple average, moving average, naive approach, ARIMA, Holt's linear trend method, the Holt-Winters seasonal method, and others. In this paper, we propose a hardware implementation of an LSTM network architecture for the time-series forecasting problem. All simulations were performed using TSMC 0.18 µm CMOS technology and the HP memristor model.
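The two parameters named above, the look-back window and the number of predicted steps, determine how a series is cut into training pairs before it reaches any forecasting model. The following is a minimal sketch of that slicing; `make_windows` and its argument names are illustrative and not taken from the paper.

```python
from typing import List, Sequence, Tuple


def make_windows(
    series: Sequence[float], look_back: int, horizon: int
) -> List[Tuple[List[float], List[float]]]:
    """Cut a univariate series into (input window, target window) pairs.

    look_back: number of past points fed to the model.
    horizon:   number of future points the model should predict.
    """
    if look_back < 1 or horizon < 1:
        raise ValueError("look_back and horizon must be positive")
    pairs = []
    last_start = len(series) - look_back - horizon
    for start in range(last_start + 1):
        split = start + look_back
        inputs = list(series[start:split])
        targets = list(series[split:split + horizon])
        pairs.append((inputs, targets))
    return pairs


if __name__ == "__main__":
    for inputs, targets in make_windows([1, 2, 3, 4, 5], look_back=3, horizon=1):
        print(inputs, "->", targets)
```

A many-to-one topology corresponds to `horizon=1`, while a larger `horizon` yields multi-step targets.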
The number of end devices that use last-mile wireless connectivity is increasing dramatically with the rise of smart infrastructures, and these devices must function reliably to support smooth and efficient business processes. To efficiently manage such massive wireless networks, more advanced and accurate network monitoring and malfunction detection solutions are required. In this paper, we perform a first-time analysis of image-based representation techniques for wireless anomaly detection using recurrence plots and Gramian angular fields, and we propose a new deep learning architecture enabling accurate anomaly detection. We examine the relative performance of the proposed model and show that the image transformation of time series improves the performance of anomaly detection by up to 29% for binary classification and by up to 27% for multiclass classification. At the same time, the best performing model, based on the recurrence plot transformation, leads to an increase of up to 55% compared to the state of the art, where classical machine learning techniques are used. We also provide insights into the decisions of the classifier using an instance-based approach enabled by guided back-propagation. Our results demonstrate the potential of transforming time series signals to images to improve classification performance compared to classification on raw time series data.
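To make the two image transformations concrete, here is a minimal sketch of a thresholded recurrence plot and a Gramian angular summation field for a univariate series. It uses the textbook definitions (no phase-space embedding for the recurrence plot, min-max rescaling to [-1, 1] for the Gramian field); the paper's exact preprocessing and parameters are not given in the abstract, so these choices are assumptions.

```python
import math
from typing import List, Sequence


def recurrence_plot(series: Sequence[float], eps: float) -> List[List[int]]:
    """Binary image: pixel (i, j) is 1 when |x_i - x_j| <= eps."""
    return [[1 if abs(a - b) <= eps else 0 for b in series] for a in series]


def gramian_angular_summation_field(series: Sequence[float]) -> List[List[float]]:
    """Image with pixel (i, j) = cos(phi_i + phi_j), where phi = arccos(x scaled to [-1, 1])."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled to [-1, 1]")
    scaled = [(2 * v - hi - lo) / (hi - lo) for v in series]
    # Clamp to guard against rounding slightly outside the arccos domain.
    angles = [math.acos(max(-1.0, min(1.0, v))) for v in scaled]
    return [[math.cos(a + b) for b in angles] for a in angles]


if __name__ == "__main__":
    signal = [0.0, 1.0, 3.0]
    for row in recurrence_plot(signal, eps=1.0):
        print(row)
    for row in gramian_angular_summation_field(signal):
        print([round(v, 3) for v in row])
```

Both functions return an n-by-n image for a series of length n, which can then be fed to an image classifier.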
In this paper, the generalized regression neural network is used to predict the GNSS position time series. Using the IGS 24-hour final solution data for Bad Hamburg permanent GNSS station in Germany, it is shown that the larger the training set of the network, the higher the accuracy, regardless of the time span of the time series. In order to analyze the performance of the neural network in various conditions, 14 permanent stations in different countries are used, namely, Spain, France, Romania, Poland, Russian Federation, United Kingdom, Czech Republic, Sweden, Ukraine, Italy, Finland, Slovak Republic, Cyprus, and Greece. The performance analysis is divided into two parts: continuous data (without gaps) and discontinuous data (with intervals of gaps where no data are available). Three measures of error are presented, namely, symmetric mean absolute percentage error, standard deviation, and mean of absolute errors. It is shown that for discontinuous data the position can be predicted with an accuracy of up to 6 centimeters, while the continuous data positions present a higher prediction accuracy, as high as 3 centimeters. In order to compare the results of this machine learning algorithm with traditional statistical approaches, the Theta method is used, which is well established for high-accuracy time series prediction. The comparison shows that the generalized regression neural network presents better accuracy than the Theta method, by a factor of up to 250. In addition, it is approximately 4.6 times faster.
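For readers unfamiliar with the generalized regression neural network, its prediction is a Gaussian-kernel weighted average of the training targets. The sketch below shows that standard formulation only; the smoothing parameter `sigma`, the input encoding, and the function name are assumptions, since the abstract does not describe the authors' configuration.

```python
import math
from typing import Sequence


def grnn_predict(
    train_x: Sequence[Sequence[float]],
    train_y: Sequence[float],
    query: Sequence[float],
    sigma: float,
) -> float:
    """Kernel-weighted average of train_y, weighted by closeness of train_x to query."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if not train_x or len(train_x) != len(train_y):
        raise ValueError("train_x and train_y must be non-empty and equally long")
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; try a larger sigma")
    return sum(w * y for w, y in zip(weights, train_y)) / total


if __name__ == "__main__":
    # Hypothetical toy data: epoch index -> position component.
    xs = [[0.0], [1.0], [2.0], [3.0]]
    ys = [0.0, 1.1, 1.9, 3.2]
    print(grnn_predict(xs, ys, [1.5], sigma=0.5))
```

Because there is no iterative training step, adding more training samples only adds more terms to the weighted average.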
Behavioural symptoms and urinary tract infections (UTI) are among the most common problems faced by people with dementia. One of the key challenges in the management of these conditions is early detection and timely intervention in order to reduce distress and avoid unplanned hospital admissions. Using in-home sensing technologies and machine learning models for sensor data integration and analysis provides opportunities to detect and predict clinically significant events and changes in health status. We have developed an integrated platform to collect in-home sensor data and performed an observational study to apply machine learning models for agitation and UTI risk analysis. We collected a large dataset from 88 participants with a mean age of 82 and a standard deviation of 6.5 (47 females and 41 males) to evaluate a new deep learning model that utilises attention and rational mechanisms. The proposed solution can process a large volume of data over a period of time, extract significant patterns in time-series data (i.e. attention), and use the extracted features and patterns to train risk analysis models (i.e. rational). The proposed model can explain its predictions by indicating which time steps and features are used in a long time series. The model provides a recall of 91\% and a precision of 83\% in detecting the risk of agitation and UTIs. This model can be used for early detection of conditions such as UTIs and for managing neuropsychiatric symptoms such as agitation, in association with initial treatment and early intervention approaches. In our study we have developed a set of clinical pathways for early interventions using the alerts generated by the proposed model, and a clinical monitoring team has been set up to use the platform and respond to the alerts according to the created intervention plans.
How is popularity gained online? Is being successful strictly related to rapidly becoming viral in an online platform or is it possible to acquire popularity in a steady and disciplined fashion? What are other temporal characteristics that can unveil the popularity of online content? To answer these questions, we leverage a multi-faceted temporal analysis of the evolution of popular online contents. Here, we present dipm-SC: a multi-dimensional shape-based time-series clustering algorithm with a heuristic to find the optimal number of clusters. First, we validate the accuracy of our algorithm on synthetic datasets generated from benchmark time series models. Second, we show that dipm-SC can uncover meaningful clusters of popularity behaviors in a real-world Twitter dataset. By clustering the multidimensional time-series of the popularity of contents coupled with other domain-specific dimensions, we uncover two main patterns of popularity: bursty and steady temporal behaviors. Moreover, we find that the way popularity is gained over time has no significant impact on the final cumulative popularity.
This article explores the number of time series points from a high-speed computer network required to accurately estimate the Hurst exponent. The methodology consists of designing an experiment in which estimators are applied to time series obtained from captures of high-speed network traffic, followed by determining the minimum number of points required to obtain accurate estimates of the Hurst exponent. The methodology includes an exhaustive analysis of the Hurst exponent estimates considering bias behaviour, standard deviation, and mean squared error, using fractional Gaussian noise signals with stationary increments. Our results show that the Whittle estimator successfully estimates the Hurst exponent in series with few points. Based on the results obtained, a minimum length for the time series is empirically proposed. Finally, to validate the results, the methodology is applied to real traffic captures in a high-speed computer network.
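The three quality measures named above are linked: the mean squared error equals the squared bias plus the variance of the estimates. The sketch below computes them for a set of repeated Hurst estimates against a known true value; the function name and the example numbers are hypothetical and are not results from the article.

```python
import math
from typing import Dict, Sequence


def estimator_quality(estimates: Sequence[float], true_h: float) -> Dict[str, float]:
    """Bias, population standard deviation, and MSE of repeated estimates."""
    n = len(estimates)
    if n == 0:
        raise ValueError("at least one estimate is required")
    mean = sum(estimates) / n
    bias = mean - true_h
    variance = sum((e - mean) ** 2 for e in estimates) / n
    mse = sum((e - true_h) ** 2 for e in estimates) / n
    return {"bias": bias, "std": math.sqrt(variance), "mse": mse}


if __name__ == "__main__":
    # Hypothetical estimates from four synthetic series generated with H = 0.75.
    quality = estimator_quality([0.7, 0.9, 0.8, 0.8], true_h=0.75)
    print(quality)
    # Decomposition check: mse should equal bias^2 + std^2 up to rounding.
    print(quality["bias"] ** 2 + quality["std"] ** 2)
```

Repeating this calculation for increasing series lengths is one way to see where the error of an estimator levels off.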
Pointwise matches between two time series are of great importance in time series analysis, and dynamic time warping (DTW) is known to provide generally reasonable matches. There are situations where time series alignment should be invariant to scaling and offset in amplitude or where local regions of the considered time series should be strongly reflected in pointwise matches. Two different variants of DTW, affine DTW (ADTW) and regional DTW (RDTW), are proposed to handle scaling and offset in amplitude and provide regional emphasis respectively. Furthermore, ADTW and RDTW can be combined in two different ways to generate alignments that incorporate advantages from both methods, where the affine model can be applied either globally to the entire time series or locally to each region. The proposed alignment methods outperform DTW on specific simulated datasets, and one-nearest-neighbor classifiers using their associated difference measures are competitive with the difference measures associated with state-of-the-art alignment methods on real datasets.
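As a point of reference for the variants proposed above, the following is a minimal sketch of classic DTW with an absolute-difference local cost. It is the baseline only, not ADTW or RDTW, and the last line of the demo shows the amplitude sensitivity that motivates an affine-invariant variant.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic dynamic time warping cost with |x - y| as the local cost."""
    if not a or not b:
        raise ValueError("both series must be non-empty")
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again (b index stays)
                cost[i][j - 1],      # b[j-1] matched again (a index stays)
                cost[i - 1][j - 1],  # both indices advance
            )
    return cost[n][m]


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # same shape, different timing
    print(dtw_distance([1, 2, 3], [2, 4, 6]))     # same shape, scaled amplitude
```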
Spatiotemporal traffic data (e.g., link speed/flow) collected from sensor networks can be organized as multivariate time series with additional spatial attributes. A crucial task in analyzing such data is to detect anomalous observations and events in data with complex spatial and temporal dependencies. Robust Principal Component Analysis (RPCA) is a widely used tool for anomaly detection. However, traditional RPCA relies purely on the global low-rank assumption while ignoring local temporal correlations. In light of this, this study proposes a Hankel-structured tensor version of RPCA for anomaly detection in spatiotemporal data. We treat the raw data with anomalies as a multivariate time series matrix (location $\times$ time) and assume the denoised matrix has a low-rank structure. Then we transform the low-rank matrix to a third-order tensor by applying temporal Hankelization. In the end, we decompose the corrupted matrix into a low-rank Hankel tensor and a sparse matrix. With the Hankelization operation, the model can simultaneously capture the global and local spatiotemporal correlations and exhibit more robust performance. We formulate the problem as an optimization problem and use the tensor nuclear norm (TNN) to approximate the tensor rank and the $l_1$ norm to approximate the sparsity. We develop an efficient solution algorithm based on the Alternating Direction Method of Multipliers (ADMM). Despite having three hyper-parameters, the model is easy to tune in practice. We evaluate the proposed method on synthetic data and metro passenger flow time series, and the results demonstrate the accuracy of its anomaly detection.
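Temporal Hankelization can be pictured as sliding a window along the time axis of each location and stacking the windows. The sketch below builds a location × shift × window tensor from a location × time matrix; the axis order and the name `window` are assumptions, since the abstract does not fix the layout of the tensor.

```python
from typing import List, Sequence


def hankelize(matrix: Sequence[Sequence[float]], window: int) -> List[List[List[float]]]:
    """Map a (location x time) matrix to a (location x shift x window) tensor.

    Entry [i][s][k] equals matrix[i][s + k], so each location's slice is a
    Hankel matrix built from that location's time series.
    """
    if not matrix:
        raise ValueError("matrix must have at least one row")
    n_time = len(matrix[0])
    if any(len(row) != n_time for row in matrix):
        raise ValueError("all rows must have the same length")
    if not 1 <= window <= n_time:
        raise ValueError("window must be between 1 and the series length")
    n_shifts = n_time - window + 1
    return [[list(row[s:s + window]) for s in range(n_shifts)] for row in matrix]


if __name__ == "__main__":
    flows = [[1, 2, 3, 4], [5, 6, 7, 8]]  # two locations, four time steps
    for location_slice in hankelize(flows, window=2):
        print(location_slice)
```

Neighbouring rows of each slice share all but one value, which is how the tensor encodes local temporal correlation in addition to the cross-location structure.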
In this paper, we develop topological data analysis methods for classification tasks on univariate time series. As an application we perform binary and ternary classification tasks on two public datasets that consist of physiological signals collected under stress and non-stress conditions. We accomplish our goal by using persistent homology to engineer stable topological features after we use a time delay embedding of the signals and perform a subwindowing instead of using windows of fixed length. The combination of methods we use can be applied to any univariate time series and in this application allows us to reduce noise and use long window sizes without incurring an extra computational cost. We then use machine learning models on the features we algorithmically engineered to obtain higher accuracies with fewer features.
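The time delay embedding mentioned above turns a univariate signal into a point cloud on which persistent homology can then be computed. Below is a minimal sketch of the standard embedding; the parameter names `dim` and `delay` are illustrative, and the paper's subwindowing scheme and the homology computation itself are not reproduced here.

```python
from typing import List, Sequence


def delay_embedding(series: Sequence[float], dim: int, delay: int) -> List[List[float]]:
    """Return points [x[t], x[t + delay], ..., x[t + (dim - 1) * delay]]."""
    if dim < 1 or delay < 1:
        raise ValueError("dim and delay must be positive")
    n_points = len(series) - (dim - 1) * delay
    if n_points < 1:
        return []  # series too short for this dim and delay
    return [[series[t + k * delay] for k in range(dim)] for t in range(n_points)]


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 5]
    for point in delay_embedding(signal, dim=2, delay=2):
        print(point)
```

A periodic signal embedded this way traces a closed loop, which is the kind of structure that persistent homology is designed to detect.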