Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Chengfu Yang

Prompt-Calibrated SAM 3 for Open-Vocabulary Remote Sensing Semantic Segmentation

Jun 20, 2026

Yanghui Song, Nanqing Liu, Haonan Yin, Yingjie Gao, Chengfu Yang, Qi Ming

Abstract:Open-vocabulary semantic segmentation (OVSS) in remote sensing images aims to segment categories beyond a fixed label space. Recent SAM 3-based methods provide a promising training-free foundation, yet three key issues remain: (1) a single class-name prompt lacks sufficient semantic coverage for complex remote sensing categories; (2) expanding each category into multiple prompts introduces redundant online text encoding; and (3) directly aggregating multiple prompt responses propagates noisy activations into the final prediction. To address these issues, we propose ProC-SAM3, which calibrates SAM 3's prompt interface for remote sensing OVSS from three complementary aspects. First, we construct an offline prompt pool where a Category Matcher groups MLLM-generated candidates into per-category sets, and Expansion Constraints further refine each set using category-specific prior knowledge. Second, the resulting text embeddings are cached and reused across all test images, eliminating repeated text encoding. Third, we introduce Presence-Guided Residual Fusion to gate unreliable decoder outputs by prompt presence and confidence, followed by peak-preserving class aggregation that retains fine-grained activations for small and sparse objects. Experiments on eight benchmarks show that ProC-SAM3 achieves an average mIoU of 56.1%, outperforming the previous best training-free method by 3.9 percentage points. Code will be available at https://github.com/YanghuiSong/ProC-SAM3.

* 5 pages, 5 figures. This is the revised version of a manuscript currently under review for publication in IEEE Geoscience and Remote Sensing Letters (GRSL)

Via

Access Paper or Ask Questions

Enhancing Underwater Images via Deep Learning: A Comparative Study of VGG19 and ResNet50-Based Approaches

Aug 24, 2025

Aoqi Li, Yanghui Song, Jichao Dao, Chengfu Yang

Abstract:This paper addresses the challenging problem of image enhancement in complex underwater scenes by proposing a solution based on deep learning. The proposed method skillfully integrates two deep convolutional neural network models, VGG19 and ResNet50, leveraging their powerful feature extraction capabilities to perform multi-scale and multi-level deep feature analysis of underwater images. By constructing a unified model, the complementary advantages of the two models are effectively integrated, achieving a more comprehensive and accurate image enhancement effect.To objectively evaluate the enhancement effect, this paper introduces image quality assessment metrics such as PSNR, UCIQE, and UIQM to quantitatively compare images before and after enhancement and deeply analyzes the performance of different models in different scenarios.Furthermore, to improve the practicality and stability of the underwater visual enhancement system, this paper also provides practical suggestions from aspects such as model optimization, multi-model fusion, and hardware selection, aiming to provide strong technical support for visual enhancement tasks in complex underwater environments.

* 7 pages, 6 figures,2025 IEEE 3rd International Conference on Image Processing and Computer Applications (ICIPCA 2025)

Via

Access Paper or Ask Questions

DS_FusionNet: Dynamic Dual-Stream Fusion with Bidirectional Knowledge Distillation for Plant Disease Recognition

Apr 30, 2025

Yanghui Song, Chengfu Yang

Figure 1 for DS_FusionNet: Dynamic Dual-Stream Fusion with Bidirectional Knowledge Distillation for Plant Disease Recognition

Abstract:Given the severe challenges confronting the global growth security of economic crops, precise identification and prevention of plant diseases has emerged as a critical issue in artificial intelligence-enabled agricultural technology. To address the technical challenges in plant disease recognition, including small-sample learning, leaf occlusion, illumination variations, and high inter-class similarity, this study innovatively proposes a Dynamic Dual-Stream Fusion Network (DS_FusionNet). The network integrates a dual-backbone architecture, deformable dynamic fusion modules, and bidirectional knowledge distillation strategy, significantly enhancing recognition accuracy. Experimental results demonstrate that DS_FusionNet achieves classification accuracies exceeding 90% using only 10% of the PlantDisease and CIFAR-10 datasets, while maintaining 85% accuracy on the complex PlantWild dataset, exhibiting exceptional generalization capabilities. This research not only provides novel technical insights for fine-grained image classification but also establishes a robust foundation for precise identification and management of agricultural diseases.

* 9 pages, 14 figures, 2025 3rd International Conference on Algorithms, Mathematical Modeling and Machinery Processing (AMMMP 2025)

Via

Access Paper or Ask Questions