Sridhar Krishnan

Estimating Quality in Therapeutic Conversations: A Multi-Dimensional Natural Language Processing Framework

May 09, 2025

Alice Rueda, Argyrios Perivolaris, Niloy Roy, Dylan Weston, Sarmed Shaya, Zachary Cote, Martin Ivanov, Bazen G. Teferra, Yuqi Wu, Sirisha Rambhatla(+8 more)

Abstract: Engagement between client and therapist is a critical determinant of therapeutic success. We propose a multi-dimensional natural language processing (NLP) framework that objectively classifies engagement quality in counseling sessions from textual transcripts. Using 253 motivational interviewing transcripts (150 high-quality, 103 low-quality), we extracted 42 features across four domains: conversational dynamics, semantic similarity as topic alignment, sentiment classification, and question detection. Classifiers, including Random Forest (RF), CatBoost, and Support Vector Machines (SVM), were hyperparameter-tuned and trained using stratified 5-fold cross-validation, then evaluated on a holdout test set. On balanced (non-augmented) data, RF achieved the highest classification accuracy (76.7%), and SVM achieved the highest AUC (85.4%). After SMOTE-Tomek augmentation, performance improved significantly: RF achieved up to 88.9% accuracy, 90.0% F1-score, and 94.6% AUC, while SVM reached 81.1% accuracy, 83.1% F1-score, and 93.6% AUC. The augmented data results reflect the potential of the framework in future larger-scale applications. Feature contribution analysis revealed that conversational dynamics and semantic similarity between clients and therapists were among the top contributors, led by the number of words uttered by the client (mean and standard deviation). The framework was robust across the original and augmented datasets and demonstrated consistent improvements in F1-score and recall. While currently text-based, the framework supports future multimodal extensions (e.g., vocal tone, facial affect) for more holistic assessments. This work introduces a scalable, data-driven method for evaluating the engagement quality of therapy sessions, offering clinicians real-time feedback to enhance the quality of both virtual and in-person therapeutic interactions.

* 12 pages, 4 figures, 7 tables
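
The abstract names the mean and standard deviation of words uttered by the client as the leading features. The following is a minimal sketch of how such a conversational-dynamics feature might be computed. It is not the authors' implementation: the `(speaker, utterance)` transcript layout, the `"client"` speaker label, whitespace tokenization, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def client_word_stats(turns):
    """Return mean and std of words per client turn.

    `turns` is a list of (speaker, utterance) pairs. This layout and the
    "client" label are hypothetical; the abstract does not specify them.
    """
    # Whitespace tokenization is an assumption, not the paper's tokenizer.
    counts = [len(text.split()) for speaker, text in turns if speaker == "client"]
    if not counts:
        return {"mean_words": 0.0, "std_words": 0.0}
    # Population std is assumed; the paper may use the sample std instead.
    return {"mean_words": mean(counts), "std_words": pstdev(counts)}


# Toy transcript, invented purely for illustration.
transcript = [
    ("therapist", "What brings you in today?"),
    ("client", "I want to cut back on drinking."),
    ("therapist", "What would that look like for you?"),
    ("client", "Maybe only on weekends."),
]
stats = client_word_stats(transcript)
```

In a pipeline like the one described, two values of this kind would be only a small part of the 42-feature vector passed to the classifiers.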

Multi-level Stress Assessment from ECG in a Virtual Reality Environment using Multimodal Fusion

Jul 09, 2021

Zeeshan Ahmad, Suha Rabbani, Muhammad Rehman Zafar, Syem Ishaque, Sridhar Krishnan, Naimul Khan

Abstract: ECG is an attractive option for assessing stress in serious Virtual Reality (VR) applications due to its non-invasive nature. However, existing Machine Learning (ML) models perform poorly. Moreover, existing studies perform only a binary stress assessment, whereas developing a more engaging biofeedback-based application requires multi-level assessment. Existing studies also annotate and classify a single experience (e.g. watching a VR video) with a single stress level, which prevents the design of dynamic experiences where real-time in-game stress assessment can be used. In this paper, we report findings from a new study on VR stress assessment in which three stress levels are assessed. ECG data was collected from 9 users experiencing a VR roller coaster. The VR experience was then manually labeled in 10-second segments with one of three stress levels by three raters. We then propose a novel multimodal deep fusion model utilizing spectrogram and 1D ECG that can provide a stress prediction from just a 1-second window. Experimental results demonstrate that the proposed model outperforms the classical HRV-based ML models (9% increase in accuracy) and baseline deep learning models (2.5% increase in accuracy). We also report results on the benchmark WESAD dataset to show the superiority of the model.

* Under review
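
The abstract describes labels assigned per 10-second segment and a model that predicts from a 1-second window. One plausible way to connect the two, sketched below, is to cut the signal into 1-second windows and let each window inherit the label of the segment that contains it. This inheritance rule is an assumption, not something the abstract states. The function name `label_windows`, the sampling rate, and the integer stress levels are hypothetical.

```python
def label_windows(signal, fs, segment_labels, segment_s=10, window_s=1):
    """Split `signal` into `window_s`-second windows with inherited labels.

    Each window takes the label of the `segment_s`-second segment it falls
    in. `fs` is the sampling rate in Hz. Trailing samples that do not fill
    a whole window, or that have no segment label, are dropped.
    """
    win = fs * window_s                    # samples per window
    per_segment = segment_s // window_s    # windows per labeled segment
    windows = []
    for i in range(len(signal) // win):
        seg = i // per_segment
        if seg >= len(segment_labels):
            break
        windows.append((signal[i * win:(i + 1) * win], segment_labels[seg]))
    return windows


fs = 4                          # hypothetical sampling rate, tiny for illustration
signal = list(range(20 * fs))   # 20 seconds of placeholder samples, not real ECG
windows = label_windows(signal, fs, [0, 2])  # two segments, stress levels 0 and 2
```

Each resulting pair could then feed both branches of a fusion model, with the raw window as the 1D input and a spectrogram computed from the same window as the second input.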
