Daniel S. Herman

Towards symbolic regression for interpretable clinical decision scores

Dec 08, 2025

Guilherme Seidyo Imai Aldeia, Joseph D. Romano, Fabricio Olivetti de Franca, Daniel S. Herman, William G. La Cava

Abstract: Medical decision-making makes frequent use of algorithms that combine risk equations with rules, providing clear and standardized treatment pathways. Symbolic regression (SR) traditionally limits its search space to continuous function forms and their parameters, making it difficult to model this decision-making. However, due to its ability to derive data-driven, interpretable models, SR holds promise for developing data-driven clinical risk scores. To that end, we introduce Brush, an SR algorithm that combines decision-tree-like splitting algorithms with non-linear constant optimization, allowing for seamless integration of rule-based logic into symbolic regression and classification models. Brush achieves Pareto-optimal performance on SRBench, and was applied to recapitulate two widely used clinical scoring systems, achieving high accuracy and interpretable models. Compared to decision trees, random forests, and other SR methods, Brush achieves comparable or superior predictive performance while producing simpler models.

* 15 pages, 5 figures. Accepted for publication in Philosophical Transactions A. Author Accepted Manuscript version
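
The abstract above describes combining decision-tree-like splitting with symbolic expressions. As a rough illustration of what a single "split node" could look like, here is a minimal sketch in plain Python. It is not Brush's implementation or API: the names `best_threshold` and `fit_split_node` are hypothetical, the leaves are plain constants (group means), and the non-linear constant optimization mentioned in the abstract is not shown.

```python
from math import inf
from typing import Callable, Sequence, Tuple


def sse(values: Sequence[float]) -> float:
    """Sum of squared errors of the values around their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_threshold(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pick a split point the way a regression tree would (hypothetical helper).

    Candidate thresholds are midpoints between consecutive distinct feature
    values; the one with the lowest combined SSE on both sides wins.
    """
    distinct = sorted(set(xs))
    best_t, best_loss = None, inf
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        loss = sse(left) + sse(right)
        if loss < best_loss:
            best_t, best_loss = t, loss
    if best_t is None:
        raise ValueError("need at least two distinct feature values to split")
    return best_t


def fit_split_node(
    xs: Sequence[float], ys: Sequence[float]
) -> Tuple[float, Callable[[float], float]]:
    """Fit the rule `if x > t then a else b`, with a and b as branch means."""
    t = best_threshold(xs, ys)
    left = [y for x, y in zip(xs, ys) if x <= t]
    right = [y for x, y in zip(xs, ys) if x > t]
    below = sum(left) / len(left)
    above = sum(right) / len(right)

    def model(x: float) -> float:
        return above if x > t else below

    return t, model


# Toy data (made up for illustration): the outcome flips once x is large.
xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]
threshold, model = fit_split_node(xs, ys)
```

In a symbolic regression setting, the two branches would presumably hold sub-expressions (for example, risk equations) rather than constants, which is what lets a rule and an equation live in the same model.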

Iterative Learning of Computable Phenotypes for Treatment Resistant Hypertension using Large Language Models

Aug 07, 2025

Guilherme Seidyo Imai Aldeia, Daniel S. Herman, William G. La Cava

Abstract: Large language models (LLMs) have demonstrated remarkable capabilities for medical question answering and programming, but their potential for generating interpretable computable phenotypes (CPs) is under-explored. In this work, we investigate whether LLMs can generate accurate and concise CPs for six clinical phenotypes of varying complexity, which could be leveraged to enable scalable clinical decision support to improve care for patients with hypertension. In addition to evaluating zero-shot performance, we propose and test a synthesize, execute, debug, instruct strategy that uses LLMs to generate and iteratively refine CPs using data-driven feedback. Our results show that LLMs, coupled with iterative learning, can generate interpretable and reasonably accurate programs that approach the performance of state-of-the-art ML methods while requiring significantly fewer training examples.

* To appear in PMLR, Volume 298, Machine Learning for Healthcare, 2025
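
The abstract above names a synthesize, execute, debug, instruct strategy without spelling out its mechanics. The sketch below shows one plausible reading of such a loop, under loud assumptions: the function `refine`, the feedback strings, the `phenotype` entry-point name, and the accuracy metric are all hypothetical, and the LLM is replaced by a stub that replays two hard-coded candidates. It is not the authors' method.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, float]
Dataset = List[Tuple[Record, bool]]


def accuracy(program: Callable[[Record], bool], data: Dataset) -> float:
    """Fraction of labelled records the candidate program classifies correctly."""
    hits = sum(1 for record, label in data if bool(program(record)) == label)
    return hits / len(data)


def refine(
    propose: Callable[[str], str], data: Dataset, rounds: int = 3
) -> Tuple[Optional[str], float]:
    """Hypothetical refinement loop around a code-generating model.

    `propose` maps a feedback message to Python source that should define
    a function named `phenotype(record)`.
    """
    feedback = "Write phenotype(record) returning True or False."
    best_src, best_acc = None, -1.0
    for _ in range(rounds):
        src = propose(feedback)  # synthesize
        try:
            namespace: dict = {}
            # execute: exec runs arbitrary code, so only use trusted input here
            exec(src, namespace)
            acc = accuracy(namespace["phenotype"], data)
        except Exception as err:  # debug: report the failure back to the model
            feedback = f"The program failed with {err!r}. Please fix it."
            continue
        if acc > best_acc:
            best_src, best_acc = src, acc
        # instruct: data-driven feedback for the next round
        feedback = f"Accuracy was {acc:.2f}. Try to improve it."
    return best_src, best_acc


# Stub standing in for an LLM: first a buggy candidate, then a working one.
_candidates = iter([
    "def phenotype(r):\n    return r['sbp'] > limit\n",  # `limit` is undefined
    "def phenotype(r):\n    return r['sbp'] > 140\n",
])


def stub_llm(feedback: str) -> str:
    return next(_candidates)


# Toy labelled records (made up; 'sbp' is an illustrative feature name).
toy_data: Dataset = [({"sbp": 150}, True), ({"sbp": 120}, False), ({"sbp": 160}, True)]
best_src, best_acc = refine(stub_llm, toy_data, rounds=2)
```

Keeping the model behind a plain callable makes the loop easy to test with a stub, and the failure branch shows how an execution error can become the next prompt instead of ending the run.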
