Xunjie He

NeSy-CSA: A Neuro-Symbolic Framework for Open-Ended Critical Scenario Attribution

Jul 04, 2026

Qitong Chu, Xunjie He, Chen Deng, Huaxin Pei, Yufeng Yue

Abstract: Understanding why discovered scenarios become critical in scenario-based testing is essential for effectively leveraging them in decision-making systems. Reasoning about such criticality can be formulated as an attribution problem. However, across different decision-making tasks, the causes of criticality may involve diverse state variables, interaction patterns, and failure mechanisms, making attribution an inherently open-ended problem beyond predefined explanation spaces. Existing attribution methods still struggle to balance open-ended reasoning flexibility with the interpretability and traceability required for critical scenario reasoning. To address this limitation, we propose NeSy-CSA, a neuro-symbolic framework that transforms open-ended critical scenario attribution from unconstrained explanation generation into structured and traceable reasoning. NeSy-CSA narrows the attribution space by selecting relevant factors, makes the reasoning process traceable through a dependency-aware evidence graph, and executes symbolic reasoning procedures derived from atomic operations, coordinated with evidence-constrained neural inference to support flexible open-ended attribution. We further introduce a process-level and result-level assessment module to evaluate the structural validity of the attribution process and the behavioral effectiveness of the attribution results under controlled interventions. Experiments across four decision-making environments show that NeSy-CSA improves two intervention-based measures of attribution effectiveness by 18.32% and 13.67% over LLM-based baselines. These results demonstrate its potential to transform discovered critical scenarios into reusable knowledge for subsequent testing and safety analysis.

DINO-CoDT: Multi-class Collaborative Detection and Tracking with Vision Foundation Models

Jun 09, 2025

Xunjie He, Christina Dao Wen Lee, Meiling Wang, Chengran Yuan, Zefan Huang, Yufeng Yue, Marcelo H. Ang Jr

Abstract: Collaborative perception plays a crucial role in enhancing environmental understanding by expanding the perceptual range and improving robustness against sensor failures. It primarily involves two tasks: collaborative 3D detection and collaborative tracking. The former focuses on object recognition in individual frames, while the latter captures continuous instance tracklets over time. However, existing works in both areas predominantly focus on the vehicle superclass and lack effective solutions for multi-class collaborative detection and tracking. This limitation hinders their applicability in real-world scenarios, which involve diverse object classes with varying appearances and motion patterns. To overcome these limitations, we propose a multi-class collaborative detection and tracking framework tailored for diverse road users. We first present a detector with a global spatial attention fusion (GSAF) module, enhancing multi-scale feature learning for objects of varying sizes. Next, we introduce a tracklet RE-IDentification (REID) module that leverages visual semantics with a vision foundation model to effectively reduce ID SWitch (IDSW) errors in cases of erroneous mismatches involving small objects like pedestrians. We further design a velocity-based adaptive tracklet management (VATM) module that adjusts the tracking interval dynamically based on object motion. Extensive experiments on the V2X-Real and OPV2V datasets show that our approach significantly outperforms existing state-of-the-art methods in both detection and tracking accuracy.
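
The abstract does not say how the VATM module maps object motion to a tracking interval, so the following is only a sketch of one plausible rule, not the authors' method. It assumes that an unmatched tracklet is kept alive for a number of frames that shrinks as the object's estimated speed grows. The function name `allowed_missed_frames`, all parameters, and the direction of the rule are hypothetical.

```python
def allowed_missed_frames(speed, base_frames=10, ref_speed=5.0,
                          min_frames=3, max_frames=30):
    """Frames an unmatched tracklet is kept before deletion (hypothetical rule).

    Assumption, not taken from the paper: slow objects such as pedestrians
    are tolerated for longer, while fast objects are dropped sooner because
    their predicted position goes stale more quickly.
    """
    # Inverse scaling around a reference speed; guard against division by zero.
    scaled = base_frames * ref_speed / max(speed, 1e-6)
    # Clamp to a fixed range so the interval stays bounded.
    return int(min(max_frames, max(min_frames, round(scaled))))


# Illustrative speeds in m/s for a pedestrian, a cyclist, and a car.
for name, speed in [("pedestrian", 1.0), ("cyclist", 5.0), ("car", 20.0)]:
    print(name, allowed_missed_frames(speed))
```

The clamp keeps the interval within fixed bounds; a real tracker would tune these bounds per dataset and frame rate.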

Open-RGBT: Open-vocabulary RGB-T Zero-shot Semantic Segmentation in Open-world Environments

Oct 09, 2024

Meng Yu, Luojie Yang, Xunjie He, Yi Yang, Yufeng Yue

Abstract: Semantic segmentation is a critical technique for effective scene understanding. Traditional RGB-T semantic segmentation models often struggle to generalize across diverse scenarios due to their reliance on pretrained models and predefined categories. Recent advancements in Visual Language Models (VLMs) have facilitated a shift from closed-set to open-vocabulary semantic segmentation methods. However, these models face challenges in dealing with intricate scenes, primarily due to the heterogeneity between RGB and thermal modalities. To address this gap, we present Open-RGBT, a novel open-vocabulary RGB-T semantic segmentation model. Specifically, we obtain instance-level detection proposals by incorporating visual prompts to enhance category understanding. Additionally, we employ the CLIP model to assess image-text similarity, which helps correct semantic inconsistencies and mitigate ambiguities in category identification. Empirical evaluations demonstrate that Open-RGBT achieves superior performance in diverse and challenging real-world scenarios, even in the wild, significantly advancing the field of RGB-T semantic segmentation.
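
To illustrate the image-text similarity step mentioned above, here is a minimal sketch of how CLIP-style scoring is commonly done: cosine similarity between an image embedding and each category's text embedding, followed by a scaled softmax. It does not reproduce the Open-RGBT pipeline. The toy three-dimensional embeddings, the category names, the `scale` value, and the function names are all illustrative assumptions; real embeddings would come from a CLIP image and text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def label_probabilities(image_emb, text_embs, scale=100.0):
    """Softmax over scaled cosine similarities, one entry per category.

    `scale` plays the role of a logit scale; the value here is an assumption.
    """
    sims = {label: cosine(image_emb, emb) for label, emb in text_embs.items()}
    top = max(sims.values())
    # Subtract the maximum before exponentiating for numerical stability.
    exps = {label: math.exp(scale * (s - top)) for label, s in sims.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


# Toy embeddings standing in for encoder outputs (hypothetical values).
image_emb = [0.9, 0.1, 0.2]
text_embs = {
    "person": [1.0, 0.0, 0.1],
    "car": [0.0, 1.0, 0.0],
    "bicycle": [0.1, 0.2, 1.0],
}

probs = label_probabilities(image_emb, text_embs)
best = max(probs, key=probs.get)
print(best)
```

In a setup like the one the abstract describes, a score of this kind could be used to re-check the category assigned to each detection proposal; how the authors combine it with the proposals is not stated here.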
