Abstract:Medical ultrasonography is an essential imaging technique for examining superficial organs and tissues, including lymph nodes, breast, and thyroid. It employs high-frequency ultrasound waves to generate detailed images of the internal structures of the human body. However, manually contouring regions of interest in these images is a labor-intensive task that demands expertise and often results in inconsistent interpretations among individuals. Vision-language foundation models, which have excelled in various computer vision applications, present new opportunities for enhancing ultrasound image analysis. Yet, their performance is hindered by the significant differences between natural and medical imaging domains. This research seeks to overcome these challenges by developing domain adaptation methods for vision-language foundation models. In this study, we explore the fine-tuning pipeline for vision-language foundation models by utilizing large language model as text refiner with special-designed adaptation strategies and task-driven heads. Our approach has been extensively evaluated on six ultrasound datasets and two tasks: segmentation and classification. The experimental results show that our method can effectively improve the performance of vision-language foundation models for ultrasound image analysis, and outperform the existing state-of-the-art vision-language and pure foundation models. The source code of this study is available at https://github.com/jinggqu/NextGen-UIA.
Abstract:Because of the limitations of the infrared imaging principle and the properties of infrared imaging systems, infrared images have some drawbacks, including a lack of details, indistinct edges, and a large amount of salt-andpepper noise. To improve the sparse characteristics of the image while maintaining the image edges and weakening staircase artifacts, this paper proposes a method that uses the Lp quasinorm instead of the L1 norm and for infrared image deblurring with an overlapping group sparse total variation method. The Lp quasinorm introduces another degree of freedom, better describes image sparsity characteristics, and improves image restoration. Furthermore, we adopt the accelerated alternating direction method of multipliers and fast Fourier transform theory in the proposed method to improve the efficiency and robustness of our algorithm. Experiments show that under different conditions for blur and salt-and-pepper noise, the proposed method leads to excellent performance in terms of objective evaluation and subjective visual results.