Russel Pears

Fingerprinting Concepts in Data Streams with Supervised and Unsupervised Meta-Information

Mar 11, 2026

Ben Halstead, Yun Sing Koh, Patricia Riddle, Mykola Pechenizkiy, Albert Bifet, Russel Pears

Abstract: Streaming sources of data are becoming more common as the ability to collect data in real time grows. A major concern in dealing with data streams is concept drift, a change in the distribution of data over time, for example, due to changes in environmental conditions. Representing concepts (stationary periods featuring similar behaviour) is a key idea in adapting to concept drift. By testing the similarity of a concept representation to a window of observations, we can detect concept drift to a new or previously seen recurring concept. Concept representations are constructed using meta-information features, values describing aspects of concept behaviour. We find that previously proposed concept representations rely on small numbers of meta-information features. These representations often cannot distinguish concepts, leaving systems vulnerable to concept drift. We propose FiCSUM, a general framework to represent both supervised and unsupervised behaviours of a concept in a fingerprint, a vector of many distinct meta-information features able to uniquely identify more concepts. Our dynamic weighting strategy learns which meta-information features describe concept drift in a given dataset, allowing a diverse set of meta-information features to be used at once. FiCSUM outperforms state-of-the-art methods over a range of 11 real-world and synthetic datasets in both accuracy and modeling underlying concept drift.
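The idea of comparing a stored concept representation against a window of recent observations can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from FiCSUM itself: the three toy meta-information features, the helper names `fingerprint`, `weighted_cosine` and `is_drift`, the fixed weights, and the threshold of 0.9. The abstract describes weights that are learned dynamically; here they are simply passed in.

```python
import math
from statistics import mean, pstdev


def fingerprint(window):
    """Build a toy fingerprint from (feature_value, label, prediction) tuples.

    Hypothetical features: mean and standard deviation of the input
    (unsupervised) and the classifier error rate (supervised).
    """
    xs = [x for x, _, _ in window]
    errors = [float(y != p) for _, y, p in window]
    return [mean(xs), pstdev(xs), mean(errors)]


def weighted_cosine(a, b, weights):
    """Cosine similarity after scaling each feature by its weight."""
    wa = [w * v for w, v in zip(weights, a)]
    wb = [w * v for w, v in zip(weights, b)]
    norm = math.sqrt(sum(v * v for v in wa)) * math.sqrt(sum(v * v for v in wb))
    if norm == 0.0:
        return 0.0
    return sum(u * v for u, v in zip(wa, wb)) / norm


def is_drift(concept_fp, window_fp, weights, threshold=0.9):
    """Flag drift when the window no longer resembles the stored concept."""
    return weighted_cosine(concept_fp, window_fp, weights) < threshold


# Two hand-made windows standing in for two different concepts.
window_a = [(0.0, 0, 0), (1.0, 1, 1), (0.0, 0, 0), (1.0, 1, 1)]
window_b = [(5.0, 0, 1), (6.0, 1, 0), (5.0, 0, 1), (6.0, 1, 1)]
weights = [1.0, 1.0, 1.0]  # assumed fixed; a real system would learn these

stored = fingerprint(window_a)
print(is_drift(stored, fingerprint(window_a), weights))
print(is_drift(stored, fingerprint(window_b), weights))
```

With more features in the vector, two concepts that agree on one statistic (say, error rate) can still be separated by another, which is the motivation the abstract gives for larger fingerprints.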

Use of Ensembles of Fourier Spectra in Capturing Recurrent Concepts in Data Streams

Apr 23, 2015

Sripirakas Sakthithasan, Russel Pears, Albert Bifet, Bernhard Pfahringer

Abstract: In this research, we apply ensembles of Fourier-encoded spectra to capture and mine recurring concepts in a data stream environment. Previous research showed that compact versions of Decision Trees can be obtained by applying the Discrete Fourier Transform to accurately capture recurrent concepts in a data stream. However, in highly volatile environments where new concepts emerge often, the approach of encoding each concept in a separate spectrum is no longer viable due to memory overload, and thus in this research we present an ensemble approach that addresses this problem. Our empirical results on real-world data and synthetic data exhibiting varying degrees of recurrence reveal that the ensemble approach outperforms the single-spectrum approach in terms of classification accuracy, memory and execution time.

* This paper has been accepted for the IJCNN 2015 conference, Ireland

Mining Recurrent Concepts in Data Streams using the Discrete Fourier Transform

Jun 24, 2014

Sakthithasan Sripirakas, Russel Pears

Abstract: In this research we address the problem of capturing recurring concepts in a data stream environment. Recurrence capture enables the re-use of previously learned classifiers without the need for re-learning while providing for better accuracy during the concept recurrence interval. We capture concepts by applying the Discrete Fourier Transform (DFT) to Decision Tree classifiers to obtain highly compressed versions of the trees at concept drift points in the stream and store such trees in a repository for future use. Our empirical results on real-world and synthetic data exhibiting varying degrees of recurrence show that the Fourier-compressed trees are more robust to noise and are able to capture recurring concepts with higher precision than a meta-learning approach that chooses to re-use classifiers in their originally occurring form.
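How a Fourier spectrum can stand in for a decision tree can be sketched as follows. This is only an illustration under stated assumptions, not the paper's implementation: features are assumed binary, the basis is the Walsh-type function (-1)^(j·x), the tree `toy_tree` is hypothetical, and the brute-force enumeration of all inputs is practical only for a handful of features. Compression here means keeping only low-order coefficients, those whose index `j` has few ones.

```python
from itertools import product


def toy_tree(x):
    """Hypothetical depth-2 decision tree over three binary features."""
    if x[0] == 0:
        return 1 if x[1] == 1 else 0
    return 1


def basis(j, x):
    """Walsh-type Fourier basis function (-1)^(j . x)."""
    return (-1) ** sum(a * b for a, b in zip(j, x))


def spectrum(f, d):
    """Fourier coefficients of f over all 2^d binary inputs."""
    points = list(product((0, 1), repeat=d))
    return {j: sum(f(x) * basis(j, x) for x in points) / len(points)
            for j in points}


def truncate(coeffs, max_order):
    """Keep coefficients whose index has at most max_order ones."""
    return {j: w for j, w in coeffs.items() if sum(j) <= max_order}


def predict(coeffs, x):
    """Inverse transform on the retained coefficients, thresholded at 0.5."""
    value = sum(w * basis(j, x) for j, w in coeffs.items())
    return 1 if value >= 0.5 else 0


full = spectrum(toy_tree, 3)
compact = truncate(full, max_order=1)

inputs = list(product((0, 1), repeat=3))
agreement = sum(predict(compact, x) == toy_tree(x) for x in inputs)
print(len(full), len(compact), agreement)
```

For this particular toy tree, the retained low-order coefficients reproduce every prediction of the original tree; for deeper trees a truncated spectrum is generally an approximation, and the cut-off order becomes a trade-off between memory and fidelity.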
