Olfa Arfaoui

Dimensionality reduction with missing values imputation

Jul 02, 2017

Rania Mkhinini Gahar, Olfa Arfaoui, Minyar Sassi Hidri, Nejib Ben-Hadj Alouane

Abstract: In this study, we propose a new statistical approach to dimensionality reduction for high-dimensional heterogeneous data that limits the curse of dimensionality and deals with missing values. To handle the latter, we propose using the Random Forest imputation method. The main purpose is to extract useful information and thereby reduce the search space, which facilitates the data exploration process. Several illustrative numerical examples, using data from publicly available machine learning repositories, are also included. The experimental component of the study shows the efficiency of the proposed analytical approach.

* 6 pages, 2 figures, The first Computer science University of Tunis El Manar, PhD Symposium (CUPS'17), Tunisia, May 22-25, 2017
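
The abstract above pairs Random Forest imputation with dimensionality reduction. A minimal sketch of that pipeline follows, under stated assumptions: it uses scikit-learn's `IterativeImputer` with a `RandomForestRegressor` as a stand-in for the imputation step, PCA as a stand-in for the reduction step, and synthetic numeric data. The function name `impute_and_reduce` and all parameter values are hypothetical, and the paper's actual method for heterogeneous (mixed-type) data may differ.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor
from sklearn.decomposition import PCA


def impute_and_reduce(X, n_components=2, seed=0):
    """Fill NaNs with a random-forest-based imputer, then project with PCA.

    Assumption: all columns are numeric. Categorical columns would need
    encoding (or a classifier-based imputer) first.
    """
    imputer = IterativeImputer(
        estimator=RandomForestRegressor(n_estimators=50, random_state=seed),
        max_iter=5,
        random_state=seed,
    )
    X_filled = imputer.fit_transform(X)
    reducer = PCA(n_components=n_components, random_state=seed)
    return X_filled, reducer.fit_transform(X_filled)


# Synthetic data (an assumption for illustration): 100 rows, 6 columns,
# one column correlated with another, and about 10% of entries missing.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
X[:, 3] = X[:, 0] + 0.1 * rng.normal(size=100)
mask = rng.random(X.shape) < 0.1
X_missing = X.copy()
X_missing[mask] = np.nan

X_filled, X_reduced = impute_and_reduce(X_missing)
```

Here `X_filled` keeps the observed entries unchanged and replaces only the missing ones, and `X_reduced` holds the two-component projection of the completed matrix.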

Classification non supervisée des données hétérogènes à large échelle (Unsupervised classification of large-scale heterogeneous data)

Jul 02, 2017

Mohamed Ali Zoghlami, Olfa Arfaoui, Minyar Sassi Hidri, Rahma Ben Ayed

Abstract: When clustering massive data, response time, disk access, and the quality of the resulting classes become major issues for companies. In this context, we define a clustering framework for large-scale heterogeneous data that helps resolve these issues. The proposed framework relies, first, on a descriptive analysis based on MCA and, second, on the MapReduce paradigm in a large-scale environment. The results are encouraging and show the efficiency of the hybrid deployment in terms of both response quality and response time, on qualitative as well as quantitative data.

* Conférence Internationale Francophone sur la Science de Données - Les 23èmes Rencontres annuelles de la Société Francophone de Classification (AAFD & SFC), Marrakech, Maroc, pp. 37-42, 2016
* 6 pages, in French, 8 figures
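
To illustrate how a clustering step can be phrased in the MapReduce style mentioned in the abstract above, here is a small single-process sketch. It rests on assumptions not stated in the listing: it uses one k-means centroid update as the clustering step, plain Python generators in place of a real MapReduce runtime, and already-numeric points (the MCA step is omitted). The names `map_phase`, `reduce_phase`, and `kmeans_step` are hypothetical.

```python
def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(chunk, centroids):
    """Map: emit (cluster index, (point, 1)) for every point in a data chunk."""
    for point in chunk:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs):
    """Reduce: sum points and counts per cluster, then average them."""
    sums = {}
    for key, (point, count) in pairs:
        if key in sums:
            acc, n = sums[key]
            sums[key] = ([a + p for a, p in zip(acc, point)], n + count)
        else:
            sums[key] = (list(point), count)
    return {k: [a / n for a in acc] for k, (acc, n) in sums.items()}


def kmeans_step(chunks, centroids):
    """One centroid update over chunked data; empty clusters keep their centroid."""
    pairs = (pair for chunk in chunks for pair in map_phase(chunk, centroids))
    updated = reduce_phase(pairs)
    return [updated.get(i, list(c)) for i, c in enumerate(centroids)]


# Two chunks stand in for data partitions processed by separate mappers.
chunks = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
centroids = [(1.0, 1.0), (9.0, 9.0)]
new_centroids = kmeans_step(chunks, centroids)
print(new_centroids)  # [[0.0, 1.0], [10.0, 11.0]]
```

Because the reducer only needs per-cluster sums and counts, each partition can be mapped independently, which is what makes this kind of update a natural fit for the paradigm.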
