Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Time Series Analysis": models, code, and papers

Empirical Analysis of Lifelog Data using Optimal Feature Selection based Unsupervised Logistic Regression (OFS-ULR) Model with Spark Streaming

Apr 12, 2022
Sadhana Tiwari, Sonali Agarwal

Recent advancement in the field of pervasive healthcare monitoring systems causes the generation of a huge amount of lifelog data in real-time. Chronic diseases are one of the most serious health challenges in developing and developed countries. According to WHO, this accounts for 73% of all deaths and 60% of the global burden of diseases. Chronic disease classification models are now harnessing the potential of lifelog data to explore better healthcare practices. This paper is to construct an optimal feature selection-based unsupervised logistic regression model (OFS-ULR) to classify chronic diseases. Since lifelog data analysis is crucial due to its sensitive nature; thus the conventional classification models show limited performance. Therefore, designing new classifiers for the classification of chronic diseases using lifelog data is the need of the age. The vital part of building a good model depends on pre-processing of the dataset, identifying important features, and then training a learning algorithm with suitable hyper parameters for better performance. The proposed approach improves the performance of existing methods using a series of steps such as (i) removing redundant or invalid instances, (ii) making the data labelled using clustering and partitioning the data into classes, (iii) identifying the suitable subset of features by applying either some domain knowledge or selection algorithm, (iv) hyper parameter tuning for models to get best results, and (v) performance evaluation using Spark streaming environment. For this purpose, two-time series datasets are used in the experiment to compute the accuracy, recall, precision, and f1-score. The experimental analysis proves the suitability of the proposed approach as compared to the conventional classifiers and our newly constructed model achieved highest accuracy and reduced training complexity among all among all.

* Data analytics and Machine learning in healthcare 
Access Paper or Ask Questions

Prediction of adverse events in Afghanistan: regression analysis of time series data grouped not by geographic dependencies

Feb 27, 2020
Krzysztof Fiok, Waldemar Karwowski, Maciej Wilamowski

The aim of this study was to approach a difficult regression task on highly unbalanced data regarding active theater of war in Afghanistan. Our focus was set on predicting the negative events number without distinguishing precise nature of the events given historical data on investment and negative events per each of predefined 400 Afghanistan districts. In contrast with previous research on the matter, we propose an approach to analysis of time series data that benefits from non-conventional aggregation of these territorial entities. By carrying out initial exploratory data analysis we demonstrate that dividing data according to our proposal allows to identify strong trend and seasonal components in the selected target variable. Utilizing this approach we also tried to estimate which data regarding investments is most important for prediction performance. Based on our exploratory analysis and previous research we prepared 5 sets of independent variables that were fed to 3 machine learning regression models. The results expressed by mean absolute and mean square errors indicate that leveraging historical data regarding target variable allows for reasonable performance, however unfortunately other proposed independent variables does not seem to improve prediction quality.

* 10 pages, 4 figures, 5 tables 
Access Paper or Ask Questions

T-WaveNet: Tree-Structured Wavelet Neural Network for Sensor-Based Time Series Analysis

Dec 10, 2020
Minhao Liu, Ailing Zeng, Qiuxia Lai, Qiang Xu

Sensor-based time series analysis is an essential task for applications such as activity recognition and brain-computer interface. Recently, features extracted with deep neural networks (DNNs) are shown to be more effective than conventional hand-crafted ones. However, most of these solutions rely solely on the network to extract application-specific information carried in the sensor data. Motivated by the fact that usually a small subset of the frequency components carries the primary information for sensor data, we propose a novel tree-structured wavelet neural network for sensor data analysis, namely \emph{T-WaveNet}. To be specific, with T-WaveNet, we first conduct a power spectrum analysis for the sensor data and decompose the input signal into various frequency subbands accordingly. Then, we construct a tree-structured network, and each node on the tree (corresponding to a frequency subband) is built with an invertible neural network (INN) based wavelet transform. By doing so, T-WaveNet provides more effective representation for sensor information than existing DNN-based techniques, and it achieves state-of-the-art performance on various sensor datasets, including UCI-HAR for activity recognition, OPPORTUNITY for gesture recognition, BCICIV2a for intention recognition, and NinaPro DB1 for muscular movement recognition.

Access Paper or Ask Questions

Wavelet Selection and Employment for Side-Channel Disassembly

Jul 25, 2021
Random Gwinn, Mark A. Matties, Aviel D. Rubin

Side-channel analysis, originally used in cryptanalysis is growing in use cases, both offensive and defensive. Wavelet analysis is a commonly employed time-frequency analysis technique used across disciplines, with a variety of purposes, and has shown increasing prevalence within side-channel literature. This paper explores wavelet selection and analysis parameters for use in side-channel analysis, particularly power side-channel-based instruction disassembly and classification. Experiments are conducted on an ATmega328P microcontroller and a subset of the AVR instruction set. Classification performance is evaluated with a time-series convolutional neural network (CNN) at clock-cycle fidelity. This work demonstrates that wavelet selection and employment parameters have meaningful impact on analysis outcomes. Practitioners should make informed decisions and consider optimizing these factors similarly to machine learning architecture and hyperparameters. We conclude that the gaus1 wavelet with scales 1-21 and grayscale colormap provided the best balance of classification performance, time, and memory efficiency in our application.

* 9 pages, 8 figures, 4 tables 
Access Paper or Ask Questions

Deep Learning for Real-time Gravitational Wave Detection and Parameter Estimation: Results with Advanced LIGO Data

Nov 08, 2017
Daniel George, E. A. Huerta

The recent Nobel-prize-winning detections of gravitational waves from merging black holes and the subsequent detection of the collision of two neutron stars in coincidence with electromagnetic observations have inaugurated a new era of multimessenger astrophysics. To enhance the scope of this emergent field of science, we pioneered the use of deep learning with convolutional neural networks, that take time-series inputs, for rapid detection and characterization of gravitational wave signals. This approach, Deep Filtering, was initially demonstrated using simulated LIGO noise. In this article, we present the extension of Deep Filtering using real data from LIGO, for both detection and parameter estimation of gravitational waves from binary black hole mergers using continuous data streams from multiple LIGO detectors. We demonstrate for the first time that machine learning can detect and estimate the true parameters of real events observed by LIGO. Our results show that Deep Filtering achieves similar sensitivities and lower errors compared to matched-filtering while being far more computationally efficient and more resilient to glitches, allowing real-time processing of weak time-series signals in non-stationary non-Gaussian noise with minimal resources, and also enables the detection of new classes of gravitational wave sources that may go unnoticed with existing detection algorithms. This unified framework for data analysis is ideally suited to enable coincident detection campaigns of gravitational waves and their multimessenger counterparts in real-time.

* Physics Letters B, 778 (2018) 64-70 
* 6 pages, 7 figures; First application of deep learning to real LIGO events; Includes direct comparison against matched-filtering 
Access Paper or Ask Questions

Wavelet-based clustering for time-series trend detection

Nov 17, 2020
Vincent Talbo, Mehdi Haddab, Derek Aubert, Redha Moulla

In this paper, we introduce a method performing clustering of time-series on the basis of their trend (increasing, stagnating/decreasing, and seasonal behavior). The clustering is performed using $k$-means method on a selection of coefficients obtained by discrete wavelet transform, reducing drastically the dimensionality. The method is applied on an use case for the clustering of a 864 daily sales revenue time-series for 61 retail shops. The results are presented for different mother wavelets. The importance of each wavelet coefficient and its level is discussed thanks to a principal component analysis along with a reconstruction of the signal from the selected wavelet coefficients.

* 10 pages, 11 figures 
Access Paper or Ask Questions

Efficiently Discovering Frequent Motifs in Large-scale Sensor Data

Jan 02, 2015
Puneet Agarwal, Gautam Shroff, Sarmimala Saikia, Zaigham Khan

While analyzing vehicular sensor data, we found that frequently occurring waveforms could serve as features for further analysis, such as rule mining, classification, and anomaly detection. The discovery of waveform patterns, also known as time-series motifs, has been studied extensively; however, available techniques for discovering frequently occurring time-series motifs were found lacking in either efficiency or quality: Standard subsequence clustering results in poor quality, to the extent that it has even been termed 'meaningless'. Variants of hierarchical clustering using techniques for efficient discovery of 'exact pair motifs' find high-quality frequent motifs, but at the cost of high computational complexity, making such techniques unusable for our voluminous vehicular sensor data. We show that good quality frequent motifs can be discovered using bounded spherical clustering of time-series subsequences, which we refer to as COIN clustering, with near linear complexity in time-series size. COIN clustering addresses many of the challenges that previously led to subsequence clustering being viewed as meaningless. We describe an end-to-end motif-discovery procedure using a sequence of pre and post-processing techniques that remove trivial-matches and shifted-motifs, which also plagued previous subsequence-clustering approaches. We demonstrate that our technique efficiently discovers frequent motifs in voluminous vehicular sensor data as well as in publicly available data sets.

* 13 pages, 8 figures, Technical Report 
Access Paper or Ask Questions

National-scale electricity peak load forecasting: Traditional, machine learning, or hybrid model?

Jun 30, 2021
Juyong Lee, Youngsang Cho

As the volatility of electricity demand increases owing to climate change and electrification, the importance of accurate peak load forecasting is increasing. Traditional peak load forecasting has been conducted through time series-based models; however, recently, new models based on machine or deep learning are being introduced. This study performs a comparative analysis to determine the most accurate peak load-forecasting model for Korea, by comparing the performance of time series, machine learning, and hybrid models. Seasonal autoregressive integrated moving average with exogenous variables (SARIMAX) is used for the time series model. Artificial neural network (ANN), support vector regression (SVR), and long short-term memory (LSTM) are used for the machine learning models. SARIMAX-ANN, SARIMAX-SVR, and SARIMAX-LSTM are used for the hybrid models. The results indicate that the hybrid models exhibit significant improvement over the SARIMAX model. The LSTM-based models outperformed the others; the single and hybrid LSTM models did not exhibit a significant performance difference. In the case of Korea's highest peak load in 2019, the predictive power of the LSTM model proved to be greater than that of the SARIMAX-LSTM model. The LSTM, SARIMAX-SVR, and SARIMAX-LSTM models outperformed the current time series-based forecasting model used in Korea. Thus, Korea's peak load-forecasting performance can be improved by including machine learning or hybrid models.

Access Paper or Ask Questions

Local Exceptionality Detection in Time Series Using Subgroup Discovery

Aug 05, 2021
Dan Hudson, Travis J. Wiltshire, Martin Atzmueller

In this paper, we present a novel approach for local exceptionality detection on time series data. This method provides the ability to discover interpretable patterns in the data, which can be used to understand and predict the progression of a time series. This being an exploratory approach, the results can be used to generate hypotheses about the relationships between the variables describing a specific process and its dynamics. We detail our approach in a concrete instantiation and exemplary implementation, specifically in the field of teamwork research. Using a real-world dataset of team interactions we include results from an example data analytics application of our proposed approach, showcase novel analysis options, and discuss possible implications of the results from the perspective of teamwork research.

Access Paper or Ask Questions

Specific Differential Entropy Rate Estimation for Continuous-Valued Time Series

Jun 08, 2016
David Darmon

We introduce a method for quantifying the inherent unpredictability of a continuous-valued time series via an extension of the differential Shannon entropy rate. Our extension, the specific entropy rate, quantifies the amount of predictive uncertainty associated with a specific state, rather than averaged over all states. We relate the specific entropy rate to popular `complexity' measures such as Approximate and Sample Entropies. We provide a data-driven approach for estimating the specific entropy rate of an observed time series. Finally, we consider three case studies of estimating specific entropy rate from synthetic and physiological data relevant to the analysis of heart rate variability.

* Entropy 18.5 (2016): 190 
Access Paper or Ask Questions