Shubham Kulkarni

All Required, In Order: Phase-Level Evaluation for AI-Human Dialogue in Healthcare and Beyond

Jan 13, 2026

Shubham Kulkarni, Alexander Lyzhov, Shiva Chaitanya, Preetam Joshi

Abstract: Conversational AI is starting to support real clinical work, but most evaluation methods miss how compliance depends on the full course of a conversation. We introduce Obligatory-Information Phase Structured Compliance Evaluation (OIP-SCE), an evaluation method that checks whether every required clinical obligation is met, in the right order, with clear evidence for clinicians to review. This makes complex rules practical and auditable, helping close the gap between technical progress and what healthcare actually needs. We demonstrate the method in two case studies (respiratory history, benefits verification) and show how phase-level evidence turns policy into shared, actionable steps. By giving clinicians control over what to check and engineers a clear specification to implement, OIP-SCE provides a single, auditable evaluation surface that aligns AI capability with clinical workflow and supports routine, safe use.

* Accepted at the AI for Medicine and Healthcare (AIMedHealth) Bridge Program, AAAI-26, Singapore. Full-length paper; to appear in Proceedings of Machine Learning Research (PMLR)
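
The "all required, in order" idea from the abstract can be illustrated with a minimal sketch. This is not the authors' OIP-SCE implementation: the phase names, the `PhaseResult` type, and the assumption that each dialogue turn already carries a set of phase labels are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class PhaseResult:
    phase: str
    status: str                 # "met", "out_of_order", or "missing"
    turn_index: Optional[int]   # evidence: turn where the phase was observed


def evaluate_phases(required: List[str], turns: List[Set[str]]) -> List[PhaseResult]:
    """Check that each required phase occurs, and in the required order.

    `turns[i]` is the (assumed, pre-computed) set of phases covered by turn i.
    """
    results = []
    cursor = 0  # earliest turn at which the next phase may legally occur
    for phase in required:
        ordered = next((i for i in range(cursor, len(turns)) if phase in turns[i]), None)
        anywhere = next((i for i, labels in enumerate(turns) if phase in labels), None)
        if ordered is not None:
            results.append(PhaseResult(phase, "met", ordered))
            cursor = ordered
        elif anywhere is not None:
            # The phase was covered, but only before an earlier required phase.
            results.append(PhaseResult(phase, "out_of_order", anywhere))
        else:
            results.append(PhaseResult(phase, "missing", None))
    return results


def is_compliant(results: List[PhaseResult]) -> bool:
    return all(r.status == "met" for r in results)


# Hypothetical phases for a respiratory-history style conversation.
REQUIRED = ["consent", "symptoms", "duration"]
good = evaluate_phases(REQUIRED, [{"consent"}, {"symptoms"}, {"duration"}])
swapped = evaluate_phases(REQUIRED, [{"consent"}, {"duration"}, {"symptoms"}])
short = evaluate_phases(REQUIRED, [{"consent"}, {"symptoms"}])
```

Each result keeps the index of the turn that serves as evidence, so a reviewer can see where a phase was satisfied, where it was covered out of order, or that it never appeared.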

Towards Self-Adaptive Machine Learning-Enabled Systems Through QoS-Aware Model Switching

Aug 19, 2023

Shubham Kulkarni, Arya Marda, Karthik Vaidhyanathan

Abstract: Machine Learning (ML), particularly deep learning, has seen vast advancements, leading to the rise of Machine Learning-Enabled Systems (MLS). However, numerous software engineering challenges persist in propelling these MLS into production, largely due to various run-time uncertainties that impact the overall Quality of Service (QoS). These uncertainties emanate from ML models, software components, and environmental factors. Self-adaptation techniques present potential in managing run-time uncertainties, but their application in MLS remains largely unexplored. As a solution, we propose the concept of a Machine Learning Model Balancer, focusing on managing uncertainties related to ML models by using multiple models. Subsequently, we introduce AdaMLS, a novel self-adaptation approach that leverages this concept and extends the traditional MAPE-K loop for continuous MLS adaptation. AdaMLS employs lightweight unsupervised learning for dynamic model switching, thereby ensuring consistent QoS. Through a self-adaptive object detection system prototype, we demonstrate AdaMLS's effectiveness in balancing system and model performance. Preliminary results suggest AdaMLS surpasses naive and single state-of-the-art models in QoS guarantees, heralding the advancement towards self-adaptive MLS with optimal QoS in dynamic environments.

* Accepted at the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023), New Ideas and Emerging Results (NIER) track
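
For readers unfamiliar with the MAPE-K loop mentioned in the abstract, the sketch below shows a plain Monitor, Analyze, Plan, Execute cycle over shared Knowledge that switches between models based on observed latency. It is an illustrative assumption, not AdaMLS: the planning step here is a simple threshold rule rather than the unsupervised learning the abstract describes, and the `ModelSwitcher` class, model names, and thresholds are all hypothetical.

```python
from collections import deque
from statistics import mean


class ModelSwitcher:
    """Rule-based MAPE-K style loop; models are ordered fastest to most accurate."""

    def __init__(self, models, latency_target_s, window=5):
        self.models = models
        self.target = latency_target_s
        self.recent = deque(maxlen=window)   # Knowledge: recent latency samples
        self.active = len(models) - 1        # start with the most accurate model

    def monitor(self, latency_s):
        self.recent.append(latency_s)

    def analyze(self):
        avg = mean(self.recent)
        if avg > self.target:
            return "too_slow"
        if avg < 0.5 * self.target:          # hypothetical headroom threshold
            return "headroom"
        return "ok"

    def plan(self, symptom):
        if symptom == "too_slow":
            return max(self.active - 1, 0)                      # faster model
        if symptom == "headroom":
            return min(self.active + 1, len(self.models) - 1)   # more accurate model
        return self.active

    def execute(self, index):
        if index != self.active:
            self.active = index
            self.recent.clear()              # old samples describe the old model

    def step(self, latency_s):
        """Run one full loop iteration and return the model now in use."""
        self.monitor(latency_s)
        self.execute(self.plan(self.analyze()))
        return self.models[self.active]


switcher = ModelSwitcher(["small", "medium", "large"], latency_target_s=0.2, window=3)
for observed in (0.5, 0.5):
    current = switcher.step(observed)        # sustained overload steps down twice
```

Clearing the window after a switch is one possible design choice: it prevents latencies measured under the previous model from immediately triggering another switch.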
