Topic:Time Series Analysis
What is Time Series Analysis? Time series analysis comprises statistical methods for analyzing a sequence of data points collected over an interval of time to identify interesting patterns and trends.
Papers and Code
Sep 11, 2024
Abstract:Multimodal sentiment analysis aims to learn representations from different modalities to identify human emotions. However, existing works often neglect the frame-level redundancy inherent in continuous time series, resulting in incomplete modality representations with noise. To address this issue, we propose temporal-invariant learning for the first time, which constrains the distributional variations over time steps to effectively capture long-term temporal dynamics, thus enhancing the quality of the representations and the robustness of the model. To fully exploit the rich semantic information in textual knowledge, we propose a semantic-guided fusion module. By evaluating the correlations between different modalities, this module facilitates cross-modal interactions gated by modality-invariant representations. Furthermore, we introduce a modality discriminator to disentangle modality-invariant and modality-specific subspaces. Experimental results on two public datasets demonstrate the superiority of our model. Our code is available at https://github.com/X-G-Y/SATI.
* change Title, Authors, Abstract
Via
Sep 10, 2024
Abstract:Time series forecasting models typically rely on a fixed-size training set and treat all data uniformly, which may not effectively capture the specific patterns present in more challenging training samples. To address this issue, we introduce AutoTSAug, a learnable data augmentation method based on reinforcement learning. Our approach begins with an empirical analysis to determine which parts of the training data should be augmented. Specifically, we identify the so-called marginal samples by considering the prediction diversity across a set of pretrained forecasting models. Next, we propose using variational masked autoencoders as the augmentation model and applying the REINFORCE algorithm to transform the marginal samples into new data. The goal of this generative model is not only to mimic the distribution of real data but also to reduce the variance of prediction errors across the model zoo. By augmenting the marginal samples with a learnable policy, AutoTSAug substantially improves forecasting performance, advancing the prior art in this field with minimal additional computational cost.
Via
Sep 06, 2024
Abstract:Time series analysis and prediction methods currently excel in quantitative analysis, offering accurate future predictions and diverse statistical indicators, but generally falling short in elucidating the underlying evolution patterns of time series. To gain a more comprehensive understanding and provide insightful explanations, we utilize symbolic regression techniques to derive explicit expressions for the non-linear dynamics in the evolution of time series variables. However, these techniques face challenges in computational efficiency and generalizability across diverse real-world time series data. To overcome these challenges, we propose \textbf{N}eural-\textbf{E}nhanced \textbf{Mo}nte-Carlo \textbf{T}ree \textbf{S}earch (NEMoTS) for time series. NEMoTS leverages the exploration-exploitation balance of Monte-Carlo Tree Search (MCTS), significantly reducing the search space in symbolic regression and improving expression quality. Furthermore, by integrating neural networks with MCTS, NEMoTS not only capitalizes on their superior fitting capabilities to concentrate on more pertinent operations post-search space reduction, but also replaces the complex and time-consuming simulation process, thereby substantially improving computational efficiency and generalizability in time series analysis. NEMoTS offers an efficient and comprehensive approach to time series analysis. Experiments with three real-world datasets demonstrate NEMoTS's significant superiority in performance, efficiency, reliability, and interpretability, making it well-suited for large-scale real-world time series data.
Via
Sep 07, 2024
Abstract:Disturbances such as atmospheric turbulence and aero-optic effects lead to wavefront aberrations, which degrade performance in imaging and laser propagation applications. Adaptive optics (AO) provide a method to mitigate these effects by pre-compensating the wavefront before propagation. However, development and testing of AO systems requires wavefront aberration data, which is difficult and expensive to obtain. Simulation methods can be used to generate such data less expensively. For atmospheric turbulence, the Kolmogorov-Taylor model provides a well-defined power spectrum that can be combined with the well-known angular spectrum method to generate synthetic phase screens. However, as aero-optics cannot be similarly generalized, this process cannot be applied to aero-optically relevant phenomena. In this paper, we introduce ReVAR (Re-Whitened Vector Auto-Regression), a novel algorithm for data-driven aero-optic phase screen generation. ReVAR trains on an input time-series of spatially and temporally correlated wavefront images from experiment and then generates synthetic data that captures the statistics present in the experimental data. The first training step of ReVAR distills the input images to a set of prediction weights and residuals. A further step we call re-whitening uses a spatial principal component analysis (PCA) to whiten these residuals. ReVAR then uses a white noise generator and inverts the previous transformation to construct synthetic time-series of data. This algorithm is computationally efficient, able to generate arbitrarily long synthetic time-series, and produces high-quality results when tested on turbulent boundary layer (TBL) data measured from a wind tunnel experiment. Using measured data for training, the temporal power spectral density (TPSD) of data generated using ReVAR closely matches the TPSD of the experimental data.
* To be published in SPIE Unconventional Imaging, Sensing, and Adaptive
Optics 2024 proceedings
Via
Sep 05, 2024
Abstract:In order to accurately identify the performance status of mobile devices and finely adjust the user experience, a real-time performance perception evaluation method based on TOPSIS (Technique for Order Preference by Similarity to Ideal Solution) combined with entropy weighting method and time series model construction was studied. After collecting the performance characteristics of various mobile devices, the device performance profile was fitted by using PCA (principal component analysis) dimensionality reduction and feature engineering methods such as descriptive time series analysis. The ability of performance features and profiles to describe the real-time performance status of devices was understood and studied by applying the TOPSIS method and multi-level weighting processing. A time series model was constructed for the feature set under objective weighting, and multiple sensitivity (real-time, short-term, long-term) performance status perception results were provided to obtain real-time performance evaluation data and long-term stable performance prediction data. Finally, by configuring dynamic AB experiments and overlaying fine-grained power reduction strategies, the usability of the method was verified, and the accuracy of device performance status identification and prediction was compared with the performance of the profile features including dimensionality reduction time series modeling, TOPSIS method and entropy weighting method, subjective weighting, HMA method. The results show that accurate real-time performance perception results can greatly enhance business value, and this research has application effectiveness and certain forward-looking significance.
Via
Sep 06, 2024
Abstract:Traditional solar flare forecasting approaches have mostly relied on physics-based or data-driven models using solar magnetograms, treating flare predictions as a point-in-time classification problem. This approach has limitations, particularly in capturing the evolving nature of solar activity. Recognizing the limitations of traditional flare forecasting approaches, our research aims to uncover hidden relationships and the evolutionary characteristics of solar flares and their source regions. Our previously proposed Sliding Window Multivariate Time Series Forest (Slim-TSF) has shown the feasibility of usage applied on multivariate time series data. A significant aspect of this study is the comparative analysis of our updated Slim-TSF framework against the original model outcomes. Preliminary findings indicate a notable improvement, with an average increase of 5\% in both the True Skill Statistic (TSS) and Heidke Skill Score (HSS). This enhancement not only underscores the effectiveness of our refined methodology but also suggests that our systematic evaluation and feature selection approach can significantly advance the predictive accuracy of solar flare forecasting models.
* This is a preprint accepted at the 26th International Conference on
Big Data Analytics and Knowledge Discovery (DAWAK 2024)
Via
Sep 05, 2024
Abstract:Cryptocoins (i.e., Bitcoin, Ether, Litecoin) are tradable digital assets. Ownerships of cryptocoins are registered on distributed ledgers (i.e., blockchains). Secure encryption techniques guarantee the security of the transactions (transfers of coins among owners), registered into the ledger. Cryptocoins are exchanged for specific trading prices. The extreme volatility of such trading prices across all different sets of crypto-assets remains undisputed. However, the relations between the trading prices across different cryptocoins remains largely unexplored. Major coin exchanges indicate trend correlation to advise for sells or buys. However, price correlations remain largely unexplored. We shed some light on the trend correlations across a large variety of cryptocoins, by investigating their coin/price correlation trends over the past two years. We study the causality between the trends, and exploit the derived correlations to understand the accuracy of state-of-the-art forecasting techniques for time series modeling (e.g., GBMs, LSTM and GRU) of correlated cryptocoins. Our evaluation shows (i) strong correlation patterns between the most traded coins (e.g., Bitcoin and Ether) and other types of cryptocurrencies, and (ii) state-of-the-art time series forecasting algorithms can be used to forecast cryptocoins price trends. We released datasets and code to reproduce our analysis to the research community.
* DEBS 2023: Proceedings of the 17th ACM International Conference on
Distributed and Event-based Systems
Via
Sep 04, 2024
Abstract:The ability of large language models (LLMs) to interpret visual representations of data is crucial for advancing their application in data analysis and decision-making processes. This paper presents a novel synthetic dataset designed to evaluate the proficiency of LLMs in interpreting various forms of data visualizations, including plots like time series, histograms, violins, boxplots, and clusters. Our dataset is generated using controlled parameters to ensure comprehensive coverage of potential real-world scenarios. We employ multimodal text prompts with questions related to visual data in images to benchmark several state-of-the-art models like ChatGPT or Gemini, assessing their understanding and interpretative accuracy. To ensure data integrity, our benchmark dataset is generated automatically, making it entirely new and free from prior exposure to the models being tested. This strategy allows us to evaluate the models' ability to truly interpret and understand the data, eliminating possibility of pre-learned responses, and allowing for an unbiased evaluation of the models' capabilities. We also introduce quantitative metrics to assess the performance of the models, providing a robust and comprehensive evaluation tool. Benchmarking several state-of-the-art LLMs with this dataset reveals varying degrees of success, highlighting specific strengths and weaknesses in interpreting diverse types of visual data. The results provide valuable insights into the current capabilities of LLMs and identify key areas for improvement. This work establishes a foundational benchmark for future research and development aimed at enhancing the visual interpretative abilities of language models. In the future, improved LLMs with robust visual interpretation skills can significantly aid in automated data analysis, scientific research, educational tools, and business intelligence applications.
Via
Sep 04, 2024
Abstract:This study presents a comprehensive methodology for modeling and forecasting the historical time series of fire spots detected by the AQUA_M-T satellite in the Amazon, Brazil. The approach utilizes a mixed Recurrent Neural Network (RNN) model, combining Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures to predict monthly accumulations of daily detected fire spots. A summary of the data revealed a consistent seasonality over time, with annual maximum and minimum fire spot values tending to repeat at the same periods each year. The primary objective is to verify whether the forecasts capture this inherent seasonality through rigorous statistical analysis. The methodology involved careful data preparation, model configuration, and training using cross-validation with two seeds, ensuring that the data generalizes well to the test and validation sets, and confirming the convergence of the model parameters. The results indicate that the mixed LSTM and GRU model offers improved accuracy in forecasting 12 months ahead, demonstrating its effectiveness in capturing complex temporal patterns and modeling the observed time series. This research significantly contributes to the application of deep learning techniques in environmental monitoring, specifically in fire spot forecasting. In addition to improving forecast accuracy, the proposed approach highlights the potential for adaptation to other time series forecasting challenges, opening new avenues for research and development in machine learning and natural phenomenon prediction. Keywords: Time Series Forecasting, Recurrent Neural Networks, Deep Learning.
* 16 pages, in Portuguese language, 24 figures
Via
Sep 04, 2024
Abstract:Recently, the issue of adversarial robustness in the time series domain has garnered significant attention. However, the available defense mechanisms remain limited, with adversarial training being the predominant approach, though it does not provide theoretical guarantees. Randomized Smoothing has emerged as a standout method due to its ability to certify a provable lower bound on robustness radius under $\ell_p$-ball attacks. Recognizing its success, research in the time series domain has started focusing on these aspects. However, existing research predominantly focuses on time series forecasting, or under the non-$\ell_p$ robustness in statistic feature augmentation for time series classification~(TSC). Our review found that Randomized Smoothing performs modestly in TSC, struggling to provide effective assurances on datasets with poor robustness. Therefore, we propose a self-ensemble method to enhance the lower bound of the probability confidence of predicted labels by reducing the variance of classification margins, thereby certifying a larger radius. This approach also addresses the computational overhead issue of Deep Ensemble~(DE) while remaining competitive and, in some cases, outperforming it in terms of robustness. Both theoretical analysis and experimental results validate the effectiveness of our method, demonstrating superior performance in robustness testing compared to baseline approaches.
* 6 figures, 4 tables, 10 pages
Via