Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

"Image": models, code, and papers

Virtual histological staining of unlabeled autopsy tissue

Aug 02, 2023
Yuzhu Li, Nir Pillar, Jingxi Li, Tairan Liu, Di Wu, Songyu Sun, Guangdong Ma, Kevin de Haan, Luzhe Huang, Sepehr Hamidi, Anatoly Urisman, Tal Keidar Haran, William Dean Wallace, Jonathan E. Zuckerman, Aydogan Ozcan

Figure 1 for Virtual histological staining of unlabeled autopsy tissue

Figure 2 for Virtual histological staining of unlabeled autopsy tissue

Figure 3 for Virtual histological staining of unlabeled autopsy tissue

Figure 4 for Virtual histological staining of unlabeled autopsy tissue

Histological examination is a crucial step in an autopsy; however, the traditional histochemical staining of post-mortem samples faces multiple challenges, including the inferior staining quality due to autolysis caused by delayed fixation of cadaver tissue, as well as the resource-intensive nature of chemical staining procedures covering large tissue areas, which demand substantial labor, cost, and time. These challenges can become more pronounced during global health crises when the availability of histopathology services is limited, resulting in further delays in tissue fixation and more severe staining artifacts. Here, we report the first demonstration of virtual staining of autopsy tissue and show that a trained neural network can rapidly transform autofluorescence images of label-free autopsy tissue sections into brightfield equivalent images that match hematoxylin and eosin (H&E) stained versions of the same samples, eliminating autolysis-induced severe staining artifacts inherent in traditional histochemical staining of autopsied tissue. Our virtual H&E model was trained using >0.7 TB of image data and a data-efficient collaboration scheme that integrates the virtual staining network with an image registration network. The trained model effectively accentuated nuclear, cytoplasmic and extracellular features in new autopsy tissue samples that experienced severe autolysis, such as COVID-19 samples never seen before, where the traditional histochemical staining failed to provide consistent staining quality. This virtual autopsy staining technique can also be extended to necrotic tissue, and can rapidly and cost-effectively generate artifact-free H&E stains despite severe autolysis and cell death, also reducing labor, cost and infrastructure requirements associated with the standard histochemical staining.

* 24 Pages, 7 Figures

Via

Access Paper or Ask Questions

Revisiting DETR Pre-training for Object Detection

Aug 02, 2023
Yan Ma, Weicong Liang, Yiduo Hao, Bohan Chen, Xiangyu Yue, Chao Zhang, Yuhui Yuan

Figure 1 for Revisiting DETR Pre-training for Object Detection

Figure 2 for Revisiting DETR Pre-training for Object Detection

Figure 3 for Revisiting DETR Pre-training for Object Detection

Figure 4 for Revisiting DETR Pre-training for Object Detection

Motivated by that DETR-based approaches have established new records on COCO detection and segmentation benchmarks, many recent endeavors show increasing interest in how to further improve DETR-based approaches by pre-training the Transformer in a self-supervised manner while keeping the backbone frozen. Some studies already claimed significant improvements in accuracy. In this paper, we take a closer look at their experimental methodology and check if their approaches are still effective on the very recent state-of-the-art such as $\mathcal{H}$-Deformable-DETR. We conduct thorough experiments on COCO object detection tasks to study the influence of the choice of pre-training datasets, localization, and classification target generation schemes. Unfortunately, we find the previous representative self-supervised approach such as DETReg, fails to boost the performance of the strong DETR-based approaches on full data regimes. We further analyze the reasons and find that simply combining a more accurate box predictor and Objects$365$ benchmark can significantly improve the results in follow-up experiments. We demonstrate the effectiveness of our approach by achieving strong object detection results of AP=$59.3\%$ on COCO val set, which surpasses $\mathcal{H}$-Deformable-DETR + Swin-L by +$1.4\%$. Last, we generate a series of synthetic pre-training datasets by combining the very recent image-to-text captioning models (LLaVA) and text-to-image generative models (SDXL). Notably, pre-training on these synthetic datasets leads to notable improvements in object detection performance. Looking ahead, we anticipate substantial advantages through the future expansion of the synthetic pre-training dataset.

Via

Access Paper or Ask Questions

Compositional Feature Augmentation for Unbiased Scene Graph Generation

Aug 13, 2023
Lin Li, Guikun Chen, Jun Xiao, Yi Yang, Chunping Wang, Long Chen

Figure 1 for Compositional Feature Augmentation for Unbiased Scene Graph Generation

Figure 2 for Compositional Feature Augmentation for Unbiased Scene Graph Generation

Figure 3 for Compositional Feature Augmentation for Unbiased Scene Graph Generation

Figure 4 for Compositional Feature Augmentation for Unbiased Scene Graph Generation

Scene Graph Generation (SGG) aims to detect all the visual relation triplets <sub, pred, obj> in a given image. With the emergence of various advanced techniques for better utilizing both the intrinsic and extrinsic information in each relation triplet, SGG has achieved great progress over the recent years. However, due to the ubiquitous long-tailed predicate distributions, today's SGG models are still easily biased to the head predicates. Currently, the most prevalent debiasing solutions for SGG are re-balancing methods, e.g., changing the distributions of original training samples. In this paper, we argue that all existing re-balancing strategies fail to increase the diversity of the relation triplet features of each predicate, which is critical for robust SGG. To this end, we propose a novel Compositional Feature Augmentation (CFA) strategy, which is the first unbiased SGG work to mitigate the bias issue from the perspective of increasing the diversity of triplet features. Specifically, we first decompose each relation triplet feature into two components: intrinsic feature and extrinsic feature, which correspond to the intrinsic characteristics and extrinsic contexts of a relation triplet, respectively. Then, we design two different feature augmentation modules to enrich the feature diversity of original relation triplets by replacing or mixing up either their intrinsic or extrinsic features from other samples. Due to its model-agnostic nature, CFA can be seamlessly incorporated into various SGG frameworks. Extensive ablations have shown that CFA achieves a new state-of-the-art performance on the trade-off between different metrics.

* Accepted by ICCV 2023

Via

Access Paper or Ask Questions

Point Anywhere: Directed Object Estimation from Omnidirectional Images

Aug 02, 2023
Nanami Kotani, Asako Kanezaki

One of the intuitive instruction methods in robot navigation is a pointing gesture. In this study, we propose a method using an omnidirectional camera to eliminate the user/object position constraint and the left/right constraint of the pointing arm. Although the accuracy of skeleton and object detection is low due to the high distortion of equirectangular images, the proposed method enables highly accurate estimation by repeatedly extracting regions of interest from the equirectangular image and projecting them onto perspective images. Furthermore, we found that training the likelihood of the target object in machine learning further improves the estimation accuracy.

* Accepted to SIGGRAPH 2023 Poster. Project page: https://github.com/NKotani/PointAnywhere

Via

Access Paper or Ask Questions

A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment

Jul 28, 2023
Carlo Aironi, Samuele Cornell, Luca Serafini, Stefano Squartini

Figure 1 for A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment

Figure 2 for A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment

Figure 3 for A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment

Figure 4 for A Time-Frequency Generative Adversarial based method for Audio Packet Loss Concealment

Packet loss is a major cause of voice quality degradation in VoIP transmissions with serious impact on intelligibility and user experience. This paper describes a system based on a generative adversarial approach, which aims to repair the lost fragments during the transmission of audio streams. Inspired by the powerful image-to-image translation capability of Generative Adversarial Networks (GANs), we propose bin2bin, an improved pix2pix framework to achieve the translation task from magnitude spectrograms of audio frames with lost packets, to noncorrupted speech spectrograms. In order to better maintain the structural information after spectrogram translation, this paper introduces the combination of two STFT-based loss functions, mixed with the traditional GAN objective. Furthermore, we employ a modified PatchGAN structure as discriminator and we lower the concealment time by a proper initialization of the phase reconstruction algorithm. Experimental results show that the proposed method has obvious advantages when compared with the current state-of-the-art methods, as it can better handle both high packet loss rates and large gaps.

* Accepted at EUSIPCO - 31st European Signal Processing Conference, 2023

Via

Access Paper or Ask Questions

Improving Mass Detection in Mammography Images: A Study of Weakly Supervised Learning and Class Activation Map Methods

Aug 07, 2023
Vicente Sampaio, Filipe R. Cordeiro

In recent years, weakly supervised models have aided in mass detection using mammography images, decreasing the need for pixel-level annotations. However, most existing models in the literature rely on Class Activation Maps (CAM) as the activation method, overlooking the potential benefits of exploring other activation techniques. This work presents a study that explores and compares different activation maps in conjunction with state-of-the-art methods for weakly supervised training in mammography images. Specifically, we investigate CAM, GradCAM, GradCAM++, XGradCAM, and LayerCAM methods within the framework of the GMIC model for mass detection in mammography images. The evaluation is conducted on the VinDr-Mammo dataset, utilizing the metrics Accuracy, True Positive Rate (TPR), False Negative Rate (FNR), and False Positive Per Image (FPPI). Results show that using different strategies of activation maps during training and test stages leads to an improvement of the model. With this strategy, we improve the results of the GMIC method, decreasing the FPPI value and increasing TPR.

* Accepted for publication at SIBGRAPI 20203

Via

Access Paper or Ask Questions

Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising

Aug 07, 2023
Xin Jin, Jia-Wen Xiao, Ling-Hao Han, Chunle Guo, Ruixun Zhang, Xialei Liu, Chongyi Li

Figure 1 for Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising

Figure 2 for Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising

Figure 3 for Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising

Figure 4 for Lighting Every Darkness in Two Pairs: A Calibration-Free Pipeline for RAW Denoising

Calibration-based methods have dominated RAW image denoising under extremely low-light environments. However, these methods suffer from several main deficiencies: 1) the calibration procedure is laborious and time-consuming, 2) denoisers for different cameras are difficult to transfer, and 3) the discrepancy between synthetic noise and real noise is enlarged by high digital gain. To overcome the above shortcomings, we propose a calibration-free pipeline for Lighting Every Drakness (LED), regardless of the digital gain or camera sensor. Instead of calibrating the noise parameters and training repeatedly, our method could adapt to a target camera only with few-shot paired data and fine-tuning. In addition, well-designed structural modification during both stages alleviates the domain gap between synthetic and real noise without any extra computational cost. With 2 pairs for each additional digital gain (in total 6 pairs) and 0.5% iterations, our method achieves superior performance over other calibration-based methods. Our code is available at https://github.com/Srameo/LED .

* Accepted to ICCV2023

Via

Access Paper or Ask Questions

SHISRCNet: Super-resolution And Classification Network For Low-resolution Breast Cancer Histopathology Image

Jun 25, 2023
Luyuan Xie, Cong Li, Zirui Wang, Xin Zhang, Boyan Chen, Qingni Shen, Zhonghai Wu

Figure 1 for SHISRCNet: Super-resolution And Classification Network For Low-resolution Breast Cancer Histopathology Image

Figure 2 for SHISRCNet: Super-resolution And Classification Network For Low-resolution Breast Cancer Histopathology Image

Figure 3 for SHISRCNet: Super-resolution And Classification Network For Low-resolution Breast Cancer Histopathology Image

Figure 4 for SHISRCNet: Super-resolution And Classification Network For Low-resolution Breast Cancer Histopathology Image

The rapid identification and accurate diagnosis of breast cancer, known as the killer of women, have become greatly significant for those patients. Numerous breast cancer histopathological image classification methods have been proposed. But they still suffer from two problems. (1) These methods can only hand high-resolution (HR) images. However, the low-resolution (LR) images are often collected by the digital slide scanner with limited hardware conditions. Compared with HR images, LR images often lose some key features like texture, which deeply affects the accuracy of diagnosis. (2) The existing methods have fixed receptive fields, so they can not extract and fuse multi-scale features well for images with different magnification factors. To fill these gaps, we present a \textbf{S}ingle \textbf{H}istopathological \textbf{I}mage \textbf{S}uper-\textbf{R}esolution \textbf{C}lassification network (SHISRCNet), which consists of two modules: Super-Resolution (SR) and Classification (CF) modules. SR module reconstructs LR images into SR ones. CF module extracts and fuses the multi-scale features of SR images for classification. In the training stage, we introduce HR images into the CF module to enhance SHISRCNet's performance. Finally, through the joint training of these two modules, super-resolution and classified of LR images are integrated into our model. The experimental results demonstrate that the effects of our method are close to the SOTA methods with taking HR images as inputs.

* Accepted by MICCAI 2023

Via

Access Paper or Ask Questions

DermoSegDiff: A Boundary-aware Segmentation Diffusion Model for Skin Lesion Delineation

Aug 05, 2023
Afshin Bozorgpour, Yousef Sadegheih, Amirhossein Kazerouni, Reza Azad, Dorit Merhof

Figure 1 for DermoSegDiff: A Boundary-aware Segmentation Diffusion Model for Skin Lesion Delineation

Figure 2 for DermoSegDiff: A Boundary-aware Segmentation Diffusion Model for Skin Lesion Delineation

Figure 3 for DermoSegDiff: A Boundary-aware Segmentation Diffusion Model for Skin Lesion Delineation

Figure 4 for DermoSegDiff: A Boundary-aware Segmentation Diffusion Model for Skin Lesion Delineation

Skin lesion segmentation plays a critical role in the early detection and accurate diagnosis of dermatological conditions. Denoising Diffusion Probabilistic Models (DDPMs) have recently gained attention for their exceptional image-generation capabilities. Building on these advancements, we propose DermoSegDiff, a novel framework for skin lesion segmentation that incorporates boundary information during the learning process. Our approach introduces a novel loss function that prioritizes the boundaries during training, gradually reducing the significance of other regions. We also introduce a novel U-Net-based denoising network that proficiently integrates noise and semantic information inside the network. Experimental results on multiple skin segmentation datasets demonstrate the superiority of DermoSegDiff over existing CNN, transformer, and diffusion-based approaches, showcasing its effectiveness and generalization in various scenarios. The implementation is publicly accessible on \href{https://github.com/mindflow-institue/dermosegdiff}{GitHub}

* MICCAI workshop PRIME

Via

Access Paper or Ask Questions

Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services

Aug 12, 2023
Zhichao Lu, Chuntao Ding, Shangguang Wang, Ran Cheng, Felix Juefei-Xu, Vishnu Naresh Boddeti

Figure 1 for Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services

Figure 2 for Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services

Figure 3 for Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services

Figure 4 for Seed Feature Maps-based CNN Models for LEO Satellite Remote Sensing Services

Deploying high-performance convolutional neural network (CNN) models on low-earth orbit (LEO) satellites for rapid remote sensing image processing has attracted significant interest from industry and academia. However, the limited resources available on LEO satellites contrast with the demands of resource-intensive CNN models, necessitating the adoption of ground-station server assistance for training and updating these models. Existing approaches often require large floating-point operations (FLOPs) and substantial model parameter transmissions, presenting considerable challenges. To address these issues, this paper introduces a ground-station server-assisted framework. With the proposed framework, each layer of the CNN model contains only one learnable feature map (called the seed feature map) from which other feature maps are generated based on specific rules. The hyperparameters of these rules are randomly generated instead of being trained, thus enabling the generation of multiple feature maps from the seed feature map and significantly reducing FLOPs. Furthermore, since the random hyperparameters can be saved using a few random seeds, the ground station server assistance can be facilitated in updating the CNN model deployed on the LEO satellite. Experimental results on the ISPRS Vaihingen, ISPRS Potsdam, UAVid, and LoveDA datasets for semantic segmentation services demonstrate that the proposed framework outperforms existing state-of-the-art approaches. In particular, the SineFM-based model achieves a higher mIoU than the UNetFormer on the UAVid dataset, with 3.3x fewer parameters and 2.2x fewer FLOPs.

* 11 pages

Via

Access Paper or Ask Questions