Abstract: Despite significant advancements in recent decades, autonomous vehicles (AVs) continue to face challenges in traffic scenarios that human drivers navigate with ease. In such situations, AVs often become immobilized, disrupting overall traffic flow. Current recovery solutions are inadequate: remote intervention is costly and inefficient, while manual takeover excludes non-drivers and limits AV accessibility. This paper introduces StuckSolver, a novel Large Language Model (LLM)-driven recovery framework that enables AVs to resolve immobilization scenarios through self-reasoning, passenger-guided decision-making, or both. StuckSolver is designed as a plug-in module that operates on top of the AV's existing perception-planning-control stack, requiring no modification to its internal architecture. Instead, it interfaces with standard sensor data streams to detect immobilization states, interpret environmental context, and generate high-level recovery commands that the AV's native planner can execute. We evaluate StuckSolver on the Bench2Drive benchmark and in custom-designed uncertainty scenarios. Results show that StuckSolver achieves near-state-of-the-art performance through autonomous self-reasoning alone and improves further when passenger guidance is incorporated.
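To make the plug-in design concrete, the following is a minimal Python sketch of the detect-and-recover loop the abstract describes. All names here (SensorSnapshot, RecoveryCommand, detect_immobilization, the stop threshold, and the LLM interface) are illustrative assumptions for exposition, not StuckSolver's actual implementation.

```python
# A minimal sketch of a plug-in recovery loop in the spirit of StuckSolver.
# Type names, thresholds, and the LLM interface below are assumptions made
# for illustration; the paper does not prescribe this exact API.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SensorSnapshot:
    speed_mps: float            # ego speed read from the AV's sensor stream
    seconds_stationary: float   # time spent below a motion threshold
    scene_description: str      # textual summary of perceived surroundings


@dataclass
class RecoveryCommand:
    action: str     # high-level action, e.g. "nudge_forward", "reverse", "wait"
    rationale: str  # LLM explanation, useful for logging or passenger display


def detect_immobilization(snap: SensorSnapshot,
                          stop_threshold_s: float = 10.0) -> bool:
    """Flag an immobilization state: stationary well beyond a normal stop."""
    return snap.speed_mps < 0.1 and snap.seconds_stationary > stop_threshold_s


def recover(snap: SensorSnapshot,
            llm: Callable[[str], str],
            passenger_hint: Optional[str] = None) -> Optional[RecoveryCommand]:
    """Ask the LLM for a high-level command; low-level control stays native."""
    if not detect_immobilization(snap):
        return None
    prompt = (
        f"The vehicle has been stationary for {snap.seconds_stationary:.0f}s. "
        f"Scene: {snap.scene_description}. "
        + (f"Passenger guidance: {passenger_hint}. " if passenger_hint else "")
        + "Reply as '<action>: <rationale>' using one of "
          "nudge_forward / reverse / wait / change_lane."
    )
    action, _, rationale = llm(prompt).partition(":")
    return RecoveryCommand(action.strip(), rationale.strip())


# Usage with a stubbed LLM; a real deployment would call a hosted model.
if __name__ == "__main__":
    snap = SensorSnapshot(0.0, 45.0,
                          "delivery truck blocking lane, oncoming lane clear")
    cmd = recover(snap, llm=lambda p: "change_lane: oncoming lane is clear")
    print(cmd)  # the AV's native planner would execute this high-level command
```

The key design point the sketch mirrors is the separation of concerns: the module only emits high-level recovery commands, so it can sit on top of any stack whose planner accepts such directives.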
Abstract: Despite rapid advances in autonomous driving, current autonomous vehicles (AVs) lack effective bidirectional communication with their occupants, which limits personalization and recovery from immobilization. This reduces comfort and trust and may slow broader AV adoption. We propose PACE-ADS (Psychology and Cognition Enabled Automated Driving Systems), a human-centered autonomy framework that enables AVs to sense, interpret, and respond to both external traffic conditions and internal occupant states. PACE-ADS comprises three foundation-model-based agents: a Driver Agent that analyzes the driving context, a Psychologist Agent that interprets occupant psychological signals (e.g., EEG, heart rate, facial expressions) and cognitive commands (e.g., speech), and a Coordinator Agent that integrates these inputs to produce high-level behavior decisions and operational parameters. Rather than replacing existing AV modules, PACE-ADS complements them by operating at the behavioral level and delegating low-level control to the native AV stack. This separation enables closed-loop adaptation and supports integration across diverse platforms. We evaluate PACE-ADS in simulation across varied scenarios involving traffic lights, pedestrians, work zones, and car following. Results show that PACE-ADS adapts driving style to occupant state, improves ride comfort, and enables safe recovery from immobilization via autonomous reasoning or human guidance. Our findings highlight the promise of LLM-based frameworks for bridging the gap between machine autonomy and human-centered driving.
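The three-agent coordination described above can be illustrated with a short sketch of one closed-loop tick. The agent prompts, occupant-signal fields, and the 'maneuver, speed, headway' output format are assumptions chosen for demonstration; the paper does not prescribe this exact interface.

```python
# Illustrative sketch of a Driver / Psychologist / Coordinator agent loop.
# Prompts, signal fields, and parameter names are assumptions, not the
# paper's actual interface; any text-in/text-out model can back each agent.
from dataclasses import dataclass
from typing import Callable

LLM = Callable[[str], str]  # any text-in/text-out foundation model


@dataclass
class BehaviorDecision:
    maneuver: str            # e.g. "proceed", "yield", "pull_over"
    target_speed_mps: float  # operational parameter handed to the AV stack
    headway_s: float         # following-gap parameter for the native planner


def driver_agent(llm: LLM, scene: str) -> str:
    """Summarize the external driving context."""
    return llm(f"Summarize the driving context and hazards: {scene}")


def psychologist_agent(llm: LLM, eeg: str, heart_rate_bpm: int,
                       face: str, speech: str) -> str:
    """Infer occupant state and explicit requests from multimodal signals."""
    return llm(
        "Infer the occupant's state and any explicit request from: "
        f"EEG={eeg}, HR={heart_rate_bpm}bpm, face={face}, speech='{speech}'"
    )


def coordinator_agent(llm: LLM, context: str, occupant: str) -> BehaviorDecision:
    """Fuse both reports into a behavior decision plus operational parameters."""
    reply = llm(
        "Given context and occupant state, answer as "
        "'maneuver,speed_mps,headway_s'.\n"
        f"Context: {context}\nOccupant: {occupant}"
    )
    maneuver, speed, headway = (part.strip() for part in reply.split(","))
    return BehaviorDecision(maneuver, float(speed), float(headway))


# One closed-loop tick with a stubbed model; low-level control stays native.
if __name__ == "__main__":
    stub: LLM = lambda p: "proceed,6.0,2.5" if "maneuver" in p else "calm"
    context = driver_agent(stub, "work zone ahead, flagger waving traffic through")
    occupant = psychologist_agent(stub, eeg="relaxed alpha", heart_rate_bpm=72,
                                  face="neutral", speech="you can go")
    print(coordinator_agent(stub, context, occupant))
```

As in the framework itself, the sketch keeps the agents at the behavioral level: the Coordinator's output is a decision plus operational parameters, and trajectory generation and actuation remain the job of the underlying AV modules.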