Simon D. Nguyen

REALITrees: Rashomon Ensemble Active Learning for Interpretable Trees

Mar 24, 2026

Simon D. Nguyen, Hayden McTavish, Kentaro Hoffman, Cynthia Rudin, Tyler H. McCormick

Abstract: Active learning reduces labeling costs by selecting samples that maximize information gain. A dominant framework, Query-by-Committee (QBC), typically relies on perturbation-based diversity by inducing model disagreement through random feature subsetting or data blinding. While this approximates one notion of epistemic uncertainty, it sacrifices direct characterization of the plausible hypothesis space. We propose the complementary approach: Rashomon Ensembled Active Learning (REAL), which constructs a committee by exhaustively enumerating the Rashomon Set of all near-optimal models. To address functional redundancy within this set, we adopt a PAC-Bayesian framework using a Gibbs posterior to weight committee members by their empirical risk. Leveraging recent algorithmic advances, we exactly enumerate this set for the class of sparse decision trees. Across synthetic and established active learning baselines, REAL outperforms randomized ensembles, particularly in moderately noisy environments where it strategically leverages expanded model multiplicity to achieve faster convergence.
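
As a rough illustration of the committee weighting the abstract describes, the sketch below weights each committee member by a Gibbs-style factor exp(-temperature * risk) and scores each unlabeled candidate by the entropy of the weighted vote. This is not the authors' implementation: the function names, the `temperature` parameter, and the use of vote entropy as the disagreement measure are assumptions made for illustration, and the committee here is a hand-written vote matrix rather than an enumerated Rashomon set.

```python
import numpy as np

def gibbs_weights(risks, temperature=10.0):
    """Weights proportional to exp(-temperature * empirical risk).

    `temperature` is a hypothetical tuning parameter, not a value from the paper.
    """
    risks = np.asarray(risks, dtype=float)
    # Subtract the minimum risk before exponentiating for numerical stability.
    unnormalized = np.exp(-temperature * (risks - risks.min()))
    return unnormalized / unnormalized.sum()

def weighted_vote_entropy(votes, weights, n_classes):
    """Entropy of the weighted committee vote for each candidate.

    votes: integer array of shape (n_models, n_candidates) with predicted labels.
    weights: array of shape (n_models,) summing to 1.
    """
    votes = np.asarray(votes)
    probs = np.zeros((votes.shape[1], n_classes))
    for c in range(n_classes):
        # Total weight of the models that vote for class c on each candidate.
        probs[:, c] = weights @ (votes == c)
    # Replace zeros by 1 inside the log so that 0 * log(0) contributes 0.
    safe = np.where(probs > 0, probs, 1.0)
    return -(probs * np.log(safe)).sum(axis=1)

# Three committee members with empirical risks, voting on two candidates.
risks = [0.1, 0.1, 0.3]
votes = [[0, 0],
         [0, 1],
         [0, 1]]
weights = gibbs_weights(risks)
scores = weighted_vote_entropy(votes, weights, n_classes=2)
query_index = int(np.argmax(scores))  # candidate with the most disagreement
```

In this toy example the committee agrees on the first candidate and splits on the second, so the second candidate would be queried.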

Adaptive Active Learning for Regression via Reinforcement Learning

Mar 11, 2026

Simon D. Nguyen, Troy Russo, Kentaro Hoffman, Tyler H. McCormick

Abstract: Active learning for regression reduces labeling costs by selecting the most informative samples. Improved Greedy Sampling is a prominent method that balances feature-space diversity and output-space uncertainty using a static, multiplicative rule. We propose Weighted improved Greedy Sampling (WiGS), which replaces this framework with a dynamic, additive criterion. We formulate weight selection as a reinforcement learning problem, enabling an agent to adapt the exploration-investigation balance throughout learning. Experiments on 18 benchmark datasets and a synthetic environment show WiGS outperforms iGS and other baseline methods in both accuracy and labeling efficiency, particularly in domains with irregular data density where the baseline's multiplicative rule ignores high-error samples in dense regions.

* 33 pages, 103 figures. Main paper (8 pages, 4 figures) plus appendix with proofs and supplemental experimental results. Submitted to UAI 2026. Codebase available at https://github.com/thatswhatsimonsaid/WeightedGreedySampling
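
To make the contrast between a multiplicative and an additive selection rule concrete, here is a minimal sketch. It assumes the usual greedy-sampling setup in which each unlabeled candidate gets a feature-space distance `d_x` (to its nearest labeled input) and an output-space distance `d_y` (between its predicted output and the nearest labeled output). The min-max scaling and the fixed weight `w_x` are assumptions for illustration; in the paper the weight is chosen by a reinforcement learning agent, and the exact form of the criterion should be taken from the paper or the linked codebase, not from this sketch.

```python
import numpy as np

def min_distances(X_pool, y_pred_pool, X_lab, y_lab):
    """Nearest-labeled-point distances in feature space and output space."""
    # Pairwise Euclidean distances, shape (n_pool, n_labeled).
    d_x = np.linalg.norm(X_pool[:, None, :] - X_lab[None, :, :], axis=2).min(axis=1)
    # Pairwise absolute differences between predictions and observed labels.
    d_y = np.abs(y_pred_pool[:, None] - y_lab[None, :]).min(axis=1)
    return d_x, d_y

def multiplicative_scores(d_x, d_y):
    """Static product rule: a small d_x suppresses the score regardless of d_y."""
    return d_x * d_y

def additive_scores(d_x, d_y, w_x):
    """Weighted sum of min-max scaled distances (scaling is an assumption here)."""
    def scale(d):
        span = d.max() - d.min()
        return (d - d.min()) / span if span > 0 else np.zeros_like(d)
    return w_x * scale(d_x) + (1.0 - w_x) * scale(d_y)

# Candidate 0 sits in a dense region (tiny d_x) but has a large output gap;
# candidate 1 is far from the labeled data but has a small output gap.
d_x = np.array([0.01, 1.0])
d_y = np.array([5.0, 1.0])

pick_multiplicative = int(np.argmax(multiplicative_scores(d_x, d_y)))
pick_additive = int(np.argmax(additive_scores(d_x, d_y, w_x=0.1)))
```

With these toy numbers the product rule passes over the dense-region candidate despite its large output gap, while the additive rule with a small `w_x` selects it, which is the behavior the abstract attributes to irregular data density.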