Nikolai Gagunashvili

University of Akureyri

Machine learning approach to inverse problem and unfolding procedure

May 25, 2011

Nikolai Gagunashvili

Abstract: A procedure for unfolding the true distribution from experimental data is presented. Machine learning methods are applied to identify the apparatus function and solve the inverse problem simultaneously. A priori information about the true distribution, taken from theory or previous experiments, is used for Monte-Carlo simulation of the training sample. The training sample can be used to calculate a transformation from the true distribution to the measured one. This transformation provides a robust solution to the unfolding problem, with minimal biases and statistical errors for the set of distributions used to create the training sample. The dimensionality of the problem can be arbitrary. A numerical example is presented to illustrate and validate the procedure.

* 19 pages, 7 figures
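
The abstract above mentions using a Monte-Carlo training sample to calculate a transformation from the true distribution to the measured one. As a rough illustration of that single step, and not the paper's actual procedure, the following sketch estimates a binned response matrix from simulated events and folds a true histogram with it. Everything specific here is an assumption made for the example: the uniform prior on the true value, the Gaussian resolution `sigma`, the one-dimensional binning, and the helper names `bin_index`, `estimate_response`, and `fold`.

```python
import random


def bin_index(x, lo, hi, n_bins):
    """Return the bin of x in [lo, hi), or None if x falls outside the range."""
    if not (lo <= x < hi):
        return None
    return min(int((x - lo) / (hi - lo) * n_bins), n_bins - 1)


def estimate_response(n_events, n_bins, lo, hi, sigma, rng):
    """Estimate R[j][i] = P(measured in bin j | true value in bin i).

    Assumed toy apparatus: true values drawn uniformly on [lo, hi),
    measured value = true value + Gaussian noise of width sigma.
    Events measured outside [lo, hi) count as lost (inefficiency).
    """
    counts = [[0] * n_bins for _ in range(n_bins)]  # counts[measured][true]
    generated = [0] * n_bins
    for _ in range(n_events):
        x_true = rng.uniform(lo, hi)
        i = bin_index(x_true, lo, hi, n_bins)
        if i is None:  # guards against the rare x_true == hi edge case
            continue
        generated[i] += 1
        j = bin_index(x_true + rng.gauss(0.0, sigma), lo, hi, n_bins)
        if j is not None:
            counts[j][i] += 1
    return [
        [counts[j][i] / generated[i] if generated[i] else 0.0
         for i in range(n_bins)]
        for j in range(n_bins)
    ]


def fold(response, true_hist):
    """Apply the response matrix: expected measured histogram."""
    return [sum(r_ji * t for r_ji, t in zip(row, true_hist)) for row in response]


# Toy usage: fold a hypothetical true histogram through the estimated response.
rng = random.Random(1)
R = estimate_response(n_events=5000, n_bins=5, lo=0.0, hi=1.0, sigma=0.1, rng=rng)
print(fold(R, [100.0, 200.0, 300.0, 200.0, 100.0]))
```

Each column of the matrix sums to at most one; the shortfall is the fraction of events smeared out of the measured range. Unfolding proper would invert this mapping, which is the ill-posed part the paper addresses and which this sketch does not attempt.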

Classifying extremely imbalanced data sets

Nov 29, 2010

Markward Britsch, Nikolai Gagunashvili, Michael Schmelling

Abstract: Imbalanced data sets containing many more background than signal instances are very common in particle physics, and will also be characteristic of the upcoming analyses of LHC data. Following up on the work presented at ACAT 2008, we apply the multivariate technique presented there (a rule-growing algorithm with the meta-methods bagging and instance weighting) to much more imbalanced data sets, in particular a selection of D0 decays without the use of particle identification. It turns out that the quality of the result depends strongly on the number of background instances used for training. We discuss methods to exploit this in order to improve the results significantly, and how to handle and reduce the size of large training sets without loss of result quality in general. We also comment on how to take into account statistical fluctuations in receiver operating characteristic (ROC) curves when comparing classifier methods.

* PoS ACAT2010:047,2010
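
The abstract above raises the question of statistical fluctuations in ROC curves when comparing classifiers. One generic way to gauge such fluctuations, shown here only as an illustration and not as the authors' method, is to bootstrap the area under the ROC curve (AUC). The sketch below is a minimal version under stated assumptions: the classifier scores are toy Gaussian values, and the helper names `auc` and `bootstrap_auc` are hypothetical.

```python
import random


def auc(signal_scores, background_scores):
    """Area under the ROC curve as P(signal score > background score).

    Ties count as one half (the Mann-Whitney formulation).
    """
    wins = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(signal_scores) * len(background_scores))


def bootstrap_auc(signal_scores, background_scores, n_rounds, rng):
    """Resample both classes with replacement; return (mean, std) of the AUC."""
    values = []
    for _ in range(n_rounds):
        s = [rng.choice(signal_scores) for _ in signal_scores]
        b = [rng.choice(background_scores) for _ in background_scores]
        values.append(auc(s, b))
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, var ** 0.5


# Toy usage on an imbalanced sample: few signal, many background instances.
rng = random.Random(42)
signal = [rng.gauss(1.0, 1.0) for _ in range(50)]       # assumed score model
background = [rng.gauss(0.0, 1.0) for _ in range(500)]  # assumed score model
mean, std = bootstrap_auc(signal, background, n_rounds=100, rng=rng)
print(f"AUC = {mean:.3f} +/- {std:.3f}")
```

With so few signal instances, the spread is driven mainly by the signal sample, so two classifiers whose AUC values differ by less than this spread cannot be ranked reliably on such data.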
