Emilien Joly

Conformal Robust Set Estimation

Apr 20, 2026

Alejandro Cholaquidis, Emilien Joly, Leonardo Moreno

Abstract: Conformal prediction provides finite-sample, distribution-free coverage under exchangeability, but standard constructions may lack robustness in the presence of outliers or heavy tails. We propose a robust conformal method based on a non-conformity score defined as the half-mass radius around a point, equivalently the distance to its $(\lfloor n/2\rfloor+1)$-nearest neighbour. We show that the resulting conformal regions are marginally valid for any sample size and converge in probability to a robust population central set defined through a distance-to-a-measure functional. Under mild regularity conditions, we establish exponential concentration and tail bounds that quantify the deviation between the empirical conformal region and its population counterpart. These results provide a probabilistic justification for using robust geometric scores in conformal prediction, even for heavy-tailed or multi-modal distributions.
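The score described in this abstract can be illustrated with a small full-conformal sketch. This is not the authors' implementation: the function names are hypothetical, the metric is assumed to be Euclidean, and the score is assumed to be computed on the sample augmented with the candidate point, with each point counted as its own nearest neighbour (distance 0). The paper may use different conventions.

```python
import math

def half_mass_radius(point, sample):
    """Distance from `point` to its (floor(m/2)+1)-th nearest element of `sample`.

    Assumption: `point` may itself belong to `sample`, in which case it
    counts as its own nearest neighbour at distance 0.
    """
    m = len(sample)
    k = m // 2 + 1
    dists = sorted(math.dist(point, s) for s in sample)
    return dists[k - 1]

def conformal_p_value(candidate, data):
    """Full-conformal p-value of `candidate` given the observed `data`."""
    augmented = list(data) + [candidate]
    scores = [half_mass_radius(z, augmented) for z in augmented]
    cand_score = scores[-1]
    # Fraction of points whose score is at least as large as the candidate's.
    return sum(s >= cand_score for s in scores) / len(augmented)

def in_region(candidate, data, alpha=0.1):
    """True when `candidate` belongs to the level-(1 - alpha) conformal region."""
    return conformal_p_value(candidate, data) > alpha

# Toy one-dimensional sample with a single gross outlier at 100.
data = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,), (100.0,)]
central = in_region((2.0,), data, alpha=0.3)
far_away = in_region((50.0,), data, alpha=0.3)
```

In this toy sample the outlier at 100 receives the largest score, so it does not pull the region toward itself: a candidate in the middle of the bulk is accepted, while a candidate halfway to the outlier is rejected at level 0.3.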


GROS: A General Robust Aggregation Strategy

Feb 23, 2024

Alejandro Cholaquidis, Emilien Joly, Leonardo Moreno

Abstract: A new, very general, robust procedure for combining estimators in metric spaces, called GROS, is introduced. The method is reminiscent of the well-known median of means, as described in \cite{devroye2016sub}. First, the sample is divided into $K$ groups. Next, an estimator is computed for each group. Finally, these $K$ estimators are combined using a robust procedure. We prove that the resulting estimator is sub-Gaussian and derive its breakdown point, in the sense of Donoho. The robust procedure involves a minimization problem on a general metric space, but we show that the same sub-Gaussianity (up to a constant) is obtained if the minimization is taken over the sample, making GROS feasible in practice. The performance of GROS is evaluated through five simulation studies: the first focuses on classification using $k$-means, the second on the multi-armed bandit problem, the third on regression, and the fourth on set estimation under a noisy model. Lastly, we apply GROS to obtain a robust persistence diagram.
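The three steps in this abstract (split into $K$ groups, estimate per group, combine robustly) can be sketched as follows. The abstract does not state the combination rule, so the one used here is an assumption: among the $K$ group estimates, pick the one whose smallest ball containing more than half of the estimates has the smallest radius. All function names are hypothetical, the group estimator is taken to be the coordinate-wise mean, and the metric is assumed Euclidean.

```python
import math
import random

def split_into_groups(sample, k, seed=0):
    """Shuffle the sample and deal it into k groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(sample)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def mean(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    dim = len(points[0])
    return tuple(sum(p[j] for p in points) / len(points) for j in range(dim))

def majority_radius(center, estimates):
    """Smallest radius around `center` covering more than half of `estimates`.

    Assumed rule (not taken from the abstract); `center` counts itself.
    """
    dists = sorted(math.dist(center, e) for e in estimates)
    return dists[len(estimates) // 2]

def robust_aggregate(sample, k, estimator=mean, seed=0):
    """Combine k group estimates, minimizing only over the estimates themselves."""
    groups = split_into_groups(sample, k, seed)
    estimates = [estimator(g) for g in groups]
    return min(estimates, key=lambda e: majority_radius(e, estimates))

# 30 clean points in [0, 0.29] plus two gross outliers.
sample = [(0.01 * i,) for i in range(30)] + [(1e6,), (1e6,)]
robust = robust_aggregate(sample, k=5)
naive = mean(sample)
```

With two outliers and five groups, at most two group means are contaminated, so a majority of the estimates stay in the clean range and the selected estimate does too, whereas the plain mean is dragged far away. Restricting the minimization to the group estimates mirrors the abstract's remark that minimizing over the sample keeps the procedure feasible.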


Regression with Missing Data, a Comparison Study of Techniques Based on Random Forests

Oct 18, 2021

Irving Gómez-Méndez, Emilien Joly

Abstract: In this paper we present the practical benefits of a new random forest algorithm to deal with missing values in the sample. The purpose of this work is to compare the different solutions to deal with missing values with random forests and describe our new algorithm's performance as well as its algorithmic complexity. A variety of missing value mechanisms (such as MCAR, MAR, MNAR) are considered and simulated. We study the quadratic errors and the bias of our algorithm and compare it to the most popular missing values random forests algorithms in the literature. In particular, we compare those techniques for both regression and prediction purposes. This work follows a first paper, Gomez-Mendez and Joly (2020), on the consistency of this new algorithm.
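The three missingness mechanisms named in this abstract can be illustrated with a minimal simulation sketch. It follows the standard definitions (MCAR: missingness independent of the data; MAR: missingness depends only on observed values; MNAR: missingness depends on the value that goes missing) and is not the simulation design of the paper. The function names, the threshold rule, and the parameters are all hypothetical.

```python
import random

def mask_mcar(rows, prob, rng):
    """MCAR: every row loses its entry with the same probability."""
    return [rng.random() < prob for _ in rows]

def mask_mar(rows, driver, threshold, prob, rng):
    """MAR: an entry may go missing only when the always-observed
    column `driver` exceeds `threshold`."""
    return [row[driver] > threshold and rng.random() < prob for row in rows]

def mask_mnar(rows, col, threshold, prob, rng):
    """MNAR: an entry of column `col` may go missing only when its own
    value exceeds `threshold`."""
    return [row[col] > threshold and rng.random() < prob for row in rows]

def apply_mask(rows, col, mask):
    """Replace column `col` by None wherever the mask is True."""
    return [
        tuple(None if (j == col and m) else v for j, v in enumerate(row))
        for row, m in zip(rows, mask)
    ]

rng = random.Random(0)
rows = [(0.1, 5.0), (0.9, 1.0), (0.5, 9.0)]

# prob=1.0 makes the rules deterministic, which is convenient for inspection.
mar = mask_mar(rows, driver=0, threshold=0.4, prob=1.0, rng=rng)
mnar = mask_mnar(rows, col=1, threshold=4.0, prob=1.0, rng=rng)
with_holes = apply_mask(rows, col=1, mask=mnar)
```

Under the MNAR rule the large values of column 1 are the ones removed, which is why methods that ignore the mechanism can produce biased fits; under MCAR the observed rows remain a random subsample.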
