Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ming Deng

DASH: Dialogue-Aware Similarity and Handshake Recognition for Topic Segmentation in Public-Channel Conversations

Dec 17, 2025

Sijin Sun, Liangbin Zhao, Ming Deng, Xiuju Fu

Abstract:Dialogue Topic Segmentation (DTS) is crucial for understanding task-oriented public-channel communications, such as maritime VHF dialogues, which feature informal speech and implicit transitions. To address the limitations of traditional methods, we propose DASH-DTS, a novel LLM-based framework. Its core contributions are: (1) topic shift detection via dialogue handshake recognition; (2) contextual enhancement through similarity-guided example selection; and (3) the generation of selective positive and negative samples to improve model discrimination and robustness. Additionally, we release VHF-Dial, the first public dataset of real-world maritime VHF communications, to advance research in this domain. DASH-DTS provides interpretable reasoning and confidence scores for each segment. Experimental results demonstrate that our framework achieves several sota segmentation trusted accuracy on both VHF-Dial and standard benchmarks, establishing a strong foundation for stable monitoring and decision support in operational dialogues.

* Accepted by AAAIW2026

Via

Access Paper or Ask Questions

VTS-LLM: Domain-Adaptive LLM Agent for Enhancing Awareness in Vessel Traffic Services through Natural Language

May 02, 2025

Sijin Sun, Liangbin Zhao, Ming Deng, Xiuju Fu

Figure 1 for VTS-LLM: Domain-Adaptive LLM Agent for Enhancing Awareness in Vessel Traffic Services through Natural Language

Figure 2 for VTS-LLM: Domain-Adaptive LLM Agent for Enhancing Awareness in Vessel Traffic Services through Natural Language

Figure 3 for VTS-LLM: Domain-Adaptive LLM Agent for Enhancing Awareness in Vessel Traffic Services through Natural Language

Figure 4 for VTS-LLM: Domain-Adaptive LLM Agent for Enhancing Awareness in Vessel Traffic Services through Natural Language

Abstract:Vessel Traffic Services (VTS) are essential for maritime safety and regulatory compliance through real-time traffic management. However, with increasing traffic complexity and the prevalence of heterogeneous, multimodal data, existing VTS systems face limitations in spatiotemporal reasoning and intuitive human interaction. In this work, we propose VTS-LLM Agent, the first domain-adaptive large LLM agent tailored for interactive decision support in VTS operations. We formalize risk-prone vessel identification as a knowledge-augmented Text-to-SQL task, combining structured vessel databases with external maritime knowledge. To support this, we construct a curated benchmark dataset consisting of a custom schema, domain-specific corpus, and a query-SQL test set in multiple linguistic styles. Our framework incorporates NER-based relational reasoning, agent-based domain knowledge injection, semantic algebra intermediate representation, and query rethink mechanisms to enhance domain grounding and context-aware understanding. Experimental results show that VTS-LLM outperforms both general-purpose and SQL-focused baselines under command-style, operational-style, and formal natural language queries, respectively. Moreover, our analysis provides the first empirical evidence that linguistic style variation introduces systematic performance challenges in Text-to-SQL modeling. This work lays the foundation for natural language interfaces in vessel traffic services and opens new opportunities for proactive, LLM-driven maritime real-time traffic management.

* 8 pages, 5 figures, 7 tablels, submitted to ITSC2025

Via

Access Paper or Ask Questions

FMNet: Frequency-Assisted Mamba-Like Linear Attention Network for Camouflaged Object Detection

Mar 14, 2025

Ming Deng, Sijin Sun, Zihao Li, Xiaochuan Hu, Xing Wu

Abstract:Camouflaged Object Detection (COD) is challenging due to the strong similarity between camouflaged objects and their surroundings, which complicates identification. Existing methods mainly rely on spatial local features, failing to capture global information, while Transformers increase computational costs.To address this, the Frequency-Assisted Mamba-Like Linear Attention Network (FMNet) is proposed, which leverages frequency-domain learning to efficiently capture global features and mitigate ambiguity between objects and the background. FMNet introduces the Multi-Scale Frequency-Assisted Mamba-Like Linear Attention (MFM) module, integrating frequency and spatial features through a multi-scale structure to handle scale variations while reducing computational complexity. Additionally, the Pyramidal Frequency Attention Extraction (PFAE) module and the Frequency Reverse Decoder (FRD) enhance semantics and reconstruct features. Experimental results demonstrate that FMNet outperforms existing methods on multiple COD datasets, showcasing its advantages in both performance and efficiency. Code available at https://anonymous.4open.science/r/FMNet-3CE5.

Via

Access Paper or Ask Questions

Self-Adaptive Gamma Context-Aware SSM-based Model for Metal Defect Detection

Mar 06, 2025

Sijin Sun, Ming Deng, Xingrui Yu, Xinyu Xi, Liangbin Zhao

Abstract:Metal defect detection is critical in industrial quality assurance, yet existing methods struggle with grayscale variations and complex defect states, limiting its robustness. To address these challenges, this paper proposes a Self-Adaptive Gamma Context-Aware SSM-based model(GCM-DET). This advanced detection framework integrating a Dynamic Gamma Correction (GC) module to enhance grayscale representation and optimize feature extraction for precise defect reconstruction. A State-Space Search Management (SSM) architecture captures robust multi-scale features, effectively handling defects of varying shapes and scales. Focal Loss is employed to mitigate class imbalance and refine detection accuracy. Additionally, the CD5-DET dataset is introduced, specifically designed for port container maintenance, featuring significant grayscale variations and intricate defect patterns. Experimental results demonstrate that the proposed model achieves substantial improvements, with mAP@0.5 gains of 27.6\%, 6.6\%, and 2.6\% on the CD5-DET, NEU-DET, and GC10-DET datasets.

* 19 pages, 9 figures, under review

Via

Access Paper or Ask Questions

HREB-CRF: Hierarchical Reduced-bias EMA for Chinese Named Entity Recognition

Mar 03, 2025

Sijin Sun, Ming Deng, Xinrui Yu, Liangbin Zhao

Abstract:Incorrect boundary division, complex semantic representation, and differences in pronunciation and meaning often lead to errors in Chinese Named Entity Recognition(CNER). To address these issues, this paper proposes HREB-CRF framework: Hierarchical Reduced-bias EMA with CRF. The proposed method amplifies word boundaries and pools long text gradients through exponentially fixed-bias weighted average of local and global hierarchical attention. Experimental results on the MSRA, Resume, and Weibo datasets show excellent in F1, outperforming the baseline model by 1.1\%, 1.6\%, and 9.8\%. The significant improvement in F1 shows evidences of strong effectiveness and robustness of approach in CNER tasks.

* 18 pages, 10 figures, under Review

Via

Access Paper or Ask Questions