Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox


Differentially Private Median Forests for Regression and Classification

Jun 15, 2020
Shorya Consul, Sinead Williamson


Share this with someone who'll enjoy it:


Random forests are a popular method for classification and regression due to their versatility. However, this flexibility can come at the cost of user privacy, since training random forests requires multiple data queries, often on small, identifiable subsets of the training data. Differentially private approaches based on extremely random trees reduce the number of queries, but can lead to low-occupancy leaf nodes which require the addition of large amounts of noise. In this paper, we propose DiPriMe forests, a novel tree-based ensemble method for regression and classification problems, that ensures differential privacy while maintaining high utility. We construct trees based on a privatized version of the median value of attributes, obtained via the exponential mechanism. The use of the noisy median encourages balanced leaf nodes, ensuring that the noise added to the parameter estimate at each leaf is not overly large. The resulting algorithm, which is appropriate for real or categorical covariates, exhibits high utility while ensuring differential privacy.

* Submitted to NeurIPS 2020 


   Access Paper Source



Share this with someone who'll enjoy it: