Abstract:As mental health chatbots proliferate to address the global treatment gap, a critical question emerges: How do we design for relational safety the quality of interaction patterns that unfold across conversations rather than the correctness of individual responses? Current safety evaluations assess single-turn crisis responses, missing the therapeutic dynamics that determine whether chatbots help or harm over time. We introduce TherapyProbe, a design probe methodology that generates actionable design knowledge by systematically exploring chatbot conversation trajectories through adversarial multi-agent simulation. Using open-source models, TherapyProbe surfaces relational safety failures interaction patterns like "validation spirals" where chatbots progressively reinforce hopelessness, or "empathy fatigue" where responses become mechanical over turns. Our contribution is translating these failures into a Safety Pattern Library of 23 failure archetypes with corresponding design recommendations. We contribute: (1) a replicable methodology requiring no API costs, (2) a clinically-grounded failure taxonomy, and (3) design implications for developers, clinicians, and policymakers.
Abstract:Large Language Models (LLMs) are increasingly used to ``professionalize'' workplace communication, often at the cost of linguistic identity. We introduce "Cultural Ghosting", the systematic erasure of linguistic markers unique to non-native English varieties during text processing. Through analysis of 22,350 LLM outputs generated from 1,490 culturally marked texts (Indian, Singaporean,& Nigerian English) processed by five models under three prompt conditions, we quantify this phenomenon using two novel metrics: Identity Erasure Rate (IER) & Semantic Preservation Score (SPS). Across all prompts, we find an overall IER of 10.26%, with model-level variation from 3.5% to 20.5% (5.9x range). Crucially, we identify a Semantic Preservation Paradox: models maintain high semantic similarity (mean SPS = 0.748) while systematically erasing cultural markers. Pragmatic markers (politeness conventions) are 1.9x more vulnerable than lexical markers (71.5% vs. 37.1% erasure). Our experiments demonstrate that explicit cultural-preservation prompts reduce erasure by 29% without sacrificing semantic quality.
Abstract:Adaptive learning systems optimize content delivery based on performance metrics but ignore the dynamic attention fluctuations that characterize neurodivergent learners. We present AttentionGuard, a framework that detects engagement-attention states from privacy-preserving behavioral signals and adapts interface elements accordingly. Our approach models four attention states derived from ADHD phenomenology and implements five novel UI adaptation patterns including bi-directional scaffolding that responds to both understimulation and overstimulation. We validate our detection model on the OULAD dataset, achieving 87.3% classification accuracy, and demonstrate correlation with clinical ADHD profiles through cross-validation on the HYPERAKTIV dataset. A Wizard-of-Oz study with 11 adults showing ADHD characteristics found significantly reduced cognitive load in the adaptive condition (NASA-TLX: 47.2 vs 62.8, Cohen's d=1.21, p=0.008) and improved comprehension (78.4% vs 61.2%, p=0.009). Concordance analysis showed 84% agreement between wizard decisions and automated classifier predictions, supporting deployment feasibility. The system is presented as an interactive demo where observers can inspect detected attention states, observe real-time UI adaptations, and compare automated decisions with human-in-the-loop overrides. We contribute empirically validated UI patterns for attention-adaptive interfaces and evidence that behavioral attention detection can meaningfully support neurodivergent learning experiences.
Abstract:Urban traffic management demands systems that simultaneously predict future conditions, detect anomalies, and take safe corrective actions -- all while providing reliability guarantees. We present STREAM-RL, a unified framework that introduces three novel algorithmic contributions: (1) PU-GAT+, an Uncertainty-Guided Adaptive Conformal Forecaster that uses prediction uncertainty to dynamically reweight graph attention via confidence-monotonic attention, achieving distribution-free coverage guarantees; (2) CRFN-BY, a Conformal Residual Flow Network that models uncertainty-normalized residuals via normalizing flows with Benjamini-Yekutieli FDR control under arbitrary dependence; and (3) LyCon-WRL+, an Uncertainty-Guided Safe World-Model RL agent with Lyapunov stability certificates, certified Lipschitz bounds, and uncertainty-propagated imagination rollouts. To our knowledge, this is the first framework to propagate calibrated uncertainty from forecasting through anomaly detection to safe policy learning with end-to-end theoretical guarantees. Experiments on multiple real-world traffic trajectory data demonstrate that STREAM-RL achieves 91.4\% coverage efficiency, controls FDR at 4.1\% under verified dependence, and improves safety rate to 95.2\% compared to 69\% for standard PPO while achieving higher reward, with 23ms end-to-end inference latency.
Abstract:Domestic AI agents faces ethical, autonomy, and inclusion challenges, particularly for overlooked groups like children, elderly, and Neurodivergent users. We present the Plural Voices Model (PVM), a novel single-agent framework that dynamically negotiates multi-user needs through real-time value alignment, leveraging diverse public datasets on mental health, eldercare, education, and moral reasoning. Using human+synthetic curriculum design with fairness-aware scenarios and ethical enhancements, PVM identifies core values, conflicts, and accessibility requirements to inform inclusive principles. Our privacy-focused prototype features adaptive safety scaffolds, tailored interactions (e.g., step-by-step guidance for Neurodivergent users, simple wording for children), and equitable conflict resolution. In preliminary evaluations, PVM outperforms multi-agent baselines in compliance (76% vs. 70%), fairness (90% vs. 85%), safety-violation rate (0% vs. 7%), and latency. Design innovations, including video guidance, autonomy sliders, family hubs, and adaptive safety dashboards, demonstrate new directions for ethical and inclusive domestic AI, for building user-centered agentic systems in plural domestic contexts. Our Codes and Model are been open sourced, available for reproduction: https://github.com/zade90/Agora
Abstract:Aura-CAPTCHA was developed as a multi-modal CAPTCHA system to address vulnerabilities in traditional methods that are increasingly bypassed by AI technologies, such as Optical Character Recognition (OCR) and adversarial image processing. The design integrated Generative Adversarial Networks (GANs) for generating dynamic image challenges, Reinforcement Learning (RL) for adaptive difficulty tuning, and Large Language Models (LLMs) for creating text and audio prompts. Visual challenges included 3x3 grid selections with at least three correct images, while audio challenges combined randomized numbers and words into a single task. RL adjusted difficulty based on incorrect attempts, response time, and suspicious user behavior. Evaluations on real-world traffic demonstrated a 92% human success rate and a 10% bot bypass rate, significantly outperforming existing CAPTCHA systems. The system provided a robust and scalable approach for securing online applications while remaining accessible to users, addressing gaps highlighted in previous research.




Abstract:Federated Learning (FL) has undergone significant development since its inception in 2016, advancing from basic algorithms to complex methodologies tailored to address diverse challenges and use cases. However, research and benchmarking of novel FL techniques against a plethora of established state-of-the-art solutions remain challenging. To streamline this process, we introduce FLsim, a comprehensive FL simulation framework designed to meet the diverse requirements of FL workflows in the literature. FLsim is characterized by its modularity, scalability, resource efficiency, and controlled reproducibility of experimental outcomes. Its easy to use interface allows users to specify customized FL requirements through job configuration, which supports: (a) customized data distributions, ranging from non-independent and identically distributed (non-iid) data to independent and identically distributed (iid) data, (b) selection of local learning algorithms according to user preferences, with complete agnosticism to ML libraries, (c) choice of network topology illustrating communication patterns among nodes, (d) definition of model aggregation and consensus algorithms, and (e) pluggable blockchain support for enhanced robustness. Through a series of experimental evaluations, we demonstrate the effectiveness and versatility of FLsim in simulating a diverse range of state-of-the-art FL experiments. We envisage that FLsim would mark a significant advancement in FL simulation frameworks, offering unprecedented flexibility and functionality for researchers and practitioners alike.
Abstract:In-network computation represents a transformative approach to addressing the escalating demands of Artificial Intelligence (AI) workloads on network infrastructure. By leveraging the processing capabilities of network devices such as switches, routers, and Network Interface Cards (NICs), this paradigm enables AI computations to be performed directly within the network fabric, significantly reducing latency, enhancing throughput, and optimizing resource utilization. This paper provides a comprehensive analysis of optimizing in-network computation for AI, exploring the evolution of programmable network architectures, such as Software-Defined Networking (SDN) and Programmable Data Planes (PDPs), and their convergence with AI. It examines methodologies for mapping AI models onto resource-constrained network devices, addressing challenges like limited memory and computational capabilities through efficient algorithm design and model compression techniques. The paper also highlights advancements in distributed learning, particularly in-network aggregation, and the potential of federated learning to enhance privacy and scalability. Frameworks like Planter and Quark are discussed for simplifying development, alongside key applications such as intelligent network monitoring, intrusion detection, traffic management, and Edge AI. Future research directions, including runtime programmability, standardized benchmarks, and new applications paradigms, are proposed to advance this rapidly evolving field. This survey underscores the potential of in-network AI to create intelligent, efficient, and responsive networks capable of meeting the demands of next-generation AI applications.




Abstract:Signed link prediction in graphs is an important problem that has applications in diverse domains. It is a binary classification problem that predicts whether an edge between a pair of nodes is positive or negative. Existing approaches for link prediction in unsigned networks cannot be directly applied for signed link prediction due to their inherent differences. Further, additional structural constraints, like, the structural balance property of the signed networks must be considered for signed link prediction. Recent signed link prediction approaches generate node representations using either generative models or discriminative models. Inspired by the recent success of Generative Adversarial Network (GAN) based models which comprises of a discriminator and generator in several applications, we propose a Generative Adversarial Network (GAN) based model for signed networks, SigGAN. It considers the requirements of signed networks, such as, integration of information from negative edges, high imbalance in number of positive and negative edges and structural balance theory. Comparing the performance with state of the art techniques on several real-world datasets validates the effectiveness of SigGAN.




Abstract:Predicting the popularity of news article is a challenging task. Existing literature mostly focused on article contents and polarity to predict popularity. However, existing research has not considered the users' preference towards a particular article. Understanding users' preference is an important aspect for predicting the popularity of news articles. Hence, we consider the social media data, from the Twitter platform, to address this research gap. In our proposed model, we have considered the users' involvement as well as the users' reaction towards an article to predict the popularity of the article. In short, we are predicting tomorrow's headline by probing today's Twitter discussion. We have considered 300 political news article from the New York Post, and our proposed approach has outperformed other baseline models.