Abstract: Deep learning research in computer vision has grown extremely fast in the past decade, and many of its advances have been translated into novel image analysis methods for biomedical problems. Broadly speaking, many deep learning based biomedical image analysis methods can be considered instances of a general image-to-image transformation framework. In this work, we introduce MMV_Im2Im, a new open-source Python package for image-to-image transformation in bioimaging applications. The package is designed around a generic image-to-image transformation framework that can be directly used for semantic segmentation, instance segmentation, image restoration, image generation, etc. The implementation takes advantage of state-of-the-art machine learning engineering techniques, allowing users to focus on research without worrying about engineering details. We demonstrate the effectiveness of MMV_Im2Im on more than ten different biomedical problems. For biomedical machine learning researchers, we hope this new package can serve as a starting point for their specific problems and stimulate new biomedical image analysis or machine learning methods. For experimental biomedical researchers, we hope this work provides a holistic view of the image-to-image transformation concept through diverse examples, so that deep learning based image-to-image transformation can be further integrated into assay development and enable new biomedical studies that could hardly be done with traditional experimental methods alone. Source code can be found at https://github.com/MMV-Lab/mmv_im2im.
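A minimal PyTorch sketch of the image-to-image framing described above, under illustrative assumptions (the toy model and helper names here are hypothetical and are not the MMV_Im2Im API): semantic segmentation and image restoration differ only in the target image and the loss, so a single training loop serves both.

```python
import torch
import torch.nn as nn

class TinyIm2Im(nn.Module):
    """A toy fully convolutional source->target mapping (stand-in for a real backbone)."""
    def __init__(self, in_ch=1, out_ch=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, out_ch, 3, padding=1),
        )
    def forward(self, x):
        return self.net(x)

def train_step(model, optimizer, source, target, loss_fn):
    """One generic training step: the task only changes `target` and `loss_fn`."""
    optimizer.zero_grad()
    loss = loss_fn(model(source), target)
    loss.backward()
    optimizer.step()
    return loss.item()

model = TinyIm2Im()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(4, 1, 64, 64)  # source images
# Segmentation: the target is a binary mask, the loss is BCE-with-logits.
seg_loss = train_step(model, opt, x, (torch.rand(4, 1, 64, 64) > 0.5).float(),
                      nn.BCEWithLogitsLoss())
# Restoration: the target is a clean image, the loss is MSE; same code path.
res_loss = train_step(model, opt, x, torch.randn(4, 1, 64, 64), nn.MSELoss())
```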
Abstract: Deep learning (DL) based semantic segmentation methods have achieved excellent performance in biomedical image segmentation, producing high-quality probability maps from which rich instance information can be extracted to facilitate good instance segmentation. While numerous efforts have been put into developing new DL semantic segmentation models, less attention has been paid to a key issue: how to effectively explore their probability maps to attain the best possible instance segmentation. We observe that the probability maps produced by DL semantic segmentation models can be used to generate many possible instance candidates, and accurate instance segmentation can be achieved by selecting from them a set of "optimized" candidates as output instances. Further, the generated instance candidates form a well-behaved hierarchical structure (a forest), which allows selecting instances in an optimized manner. Hence, we propose a novel framework, called hierarchical earth mover's distance (H-EMD), for instance segmentation in biomedical 2D+time videos and 3D images, which judiciously integrates consistent instance selection with semantic-segmentation-generated probability maps. H-EMD contains two main stages. (1) Instance candidate generation: capturing instance-structured information in probability maps by generating many instance candidates in a forest structure. (2) Instance candidate selection: selecting instances from the candidate set for the final instance segmentation. We formulate a key instance selection problem on the instance candidate forest as an optimization problem based on the earth mover's distance (EMD), and solve it by integer linear programming. Extensive experiments on eight biomedical video or 3D datasets demonstrate that H-EMD consistently boosts DL semantic segmentation models and is highly competitive with state-of-the-art methods.
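A minimal sketch of the "instance candidate forest" idea: thresholding a probability map at increasing levels yields connected components that nest, so parent/child containment defines a forest (roots at the lowest threshold). The EMD-based ILP selection stage is not reproduced here; the thresholds are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def candidate_forest(prob, thresholds=(0.3, 0.5, 0.7)):
    """Return candidate masks and the parent links induced by containment."""
    candidates, parents = [], []
    prev_level = []  # indices of candidates from the previous (lower) threshold
    for t in sorted(thresholds):
        labeled, n = ndimage.label(prob >= t)
        level = []
        for k in range(1, n + 1):
            mask = labeled == k
            # A connected component at a higher threshold lies inside exactly
            # one component at any lower threshold, hence a forest structure.
            parent = next((p for p in prev_level
                           if np.logical_and(mask, candidates[p]).any()), None)
            candidates.append(mask)
            parents.append(parent)
            level.append(len(candidates) - 1)
        prev_level = level
    return candidates, parents

prob = np.random.rand(64, 64)  # stand-in for a model's probability map
cands, pars = candidate_forest(prob)
print(len(cands), "candidates;", sum(p is None for p in pars), "roots")
```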
Abstract: Automatic histopathology image segmentation is crucial to disease analysis. Limited labeled data hinders the generalizability of models trained under the fully supervised setting. Semi-supervised learning (SSL) based on generative methods has proven effective in utilizing diverse image characteristics. However, it has not been well explored what kinds of generated images are most useful for model training and how to use such images. In this paper, we propose a new data-guided generative method for histopathology image segmentation that leverages unlabeled data distributions. First, we design an image generation module: image content and style are disentangled and embedded in a clustering-friendly space so that their distributions can be utilized, and new images are synthesized by sampling and cross-combining contents and styles. Second, we devise an effective data selection policy for judiciously sampling the generated images: (1) to make the generated training set better cover the dataset, clusters that are underrepresented in the original training set are covered more; (2) to make the training process more effective, we identify and oversample images of "hard cases", for which annotated training data may be scarce. Our method is evaluated on gland and nucleus datasets. We show that under both the inductive and transductive settings, our SSL method consistently boosts the performance of common segmentation models and attains state-of-the-art results.
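A minimal NumPy sketch of the two-part selection policy, under illustrative assumptions: each generated image carries a cluster id (from the clustering-friendly content/style space) and a "hardness" score (e.g., a model's error or uncertainty on similar images); both names are hypothetical stand-ins.

```python
import numpy as np

def selection_weights(cluster_ids, labeled_cluster_ids, hardness, alpha=1.0):
    """Weight generated images so under-covered clusters and hard cases are
    sampled more often."""
    cluster_ids = np.asarray(cluster_ids)
    hardness = np.asarray(hardness, dtype=float)
    # (1) Coverage: clusters rare in the original labeled set get more weight.
    labeled_counts = {c: (np.asarray(labeled_cluster_ids) == c).sum()
                      for c in np.unique(cluster_ids)}
    coverage = np.array([1.0 / (1 + labeled_counts[c]) for c in cluster_ids])
    # (2) Hard cases: oversample images the current model finds difficult.
    w = coverage * (1.0 + alpha * hardness)
    return w / w.sum()

rng = np.random.default_rng(0)
gen_clusters = rng.integers(0, 5, size=200)  # clusters of generated images
lab_clusters = rng.integers(0, 3, size=50)   # labeled set covers fewer clusters
hard = rng.random(200)                       # per-image hardness scores
w = selection_weights(gen_clusters, lab_clusters, hard)
picked = rng.choice(200, size=40, replace=False, p=w)
```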
Abstract: The Encoder-Decoder architecture is a mainstream deep learning model for biomedical image segmentation: the encoder compresses the input into encoded features, and the decoder then produces dense predictions from them. However, decoders remain under-explored in such architectures. In this paper, we comprehensively study state-of-the-art Encoder-Decoder architectures and propose a new universal decoder, called the cascade decoder, to improve semantic segmentation accuracy. Our cascade decoder can be embedded into existing networks and trained jointly in an end-to-end fashion. The cascade decoder structure aims to conduct more effective decoding of hierarchically encoded features and is more compatible with common encoders than the known decoders. We replace the decoders of state-of-the-art models with our cascade decoder for several challenging biomedical image segmentation tasks, and the considerable improvements achieved demonstrate the efficacy of our new decoding method.
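A hedged PyTorch sketch of cascaded decoding of hierarchically encoded features: each stage upsamples its input, fuses it with the next encoder feature, and passes the result on. This shows only the general structure; the layer choices are assumptions, not the paper's exact cascade decoder.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CascadeDecoder(nn.Module):
    def __init__(self, enc_channels=(256, 128, 64), out_ch=2):
        super().__init__()
        chans = list(enc_channels)
        self.stages = nn.ModuleList(
            nn.Conv2d(chans[i] + chans[i + 1], chans[i + 1], 3, padding=1)
            for i in range(len(chans) - 1))
        self.head = nn.Conv2d(chans[-1], out_ch, 1)

    def forward(self, feats):
        """feats: encoder features ordered from deepest to shallowest."""
        x = feats[0]
        for stage, skip in zip(self.stages, feats[1:]):
            # Upsample the running decoded state, fuse with the encoder
            # feature at the next resolution, and continue the cascade.
            x = F.interpolate(x, size=skip.shape[-2:], mode="bilinear",
                              align_corners=False)
            x = F.relu(stage(torch.cat([x, skip], dim=1)))
        return self.head(x)

# Toy encoder features at three scales (deepest first).
feats = [torch.randn(1, 256, 8, 8), torch.randn(1, 128, 16, 16),
         torch.randn(1, 64, 32, 32)]
logits = CascadeDecoder()(feats)  # -> (1, 2, 32, 32)
```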
Abstract: Wing disc pouches of fruit flies are a powerful genetic model for studying physiological intercellular calcium ($Ca^{2+}$) signals and for dynamic analysis of cell signaling in organ development and disease studies. A key to analyzing spatial-temporal patterns of $Ca^{2+}$ signal waves is to accurately align the pouches across image sequences. However, pouches in different image frames may exhibit extensive intensity oscillations due to $Ca^{2+}$ signaling dynamics, and commonly used multimodal non-rigid registration methods may fail to achieve satisfactory results. In this paper, we develop a new two-phase non-rigid registration approach to register pouches in image sequences. First, we segment the region of interest (i.e., the pouches) using a deep neural network model. Second, we obtain an optimal transformation and align the pouches across the image sequences. Evaluated using both synthetic data and real pouch data, our method considerably outperforms state-of-the-art non-rigid registration methods.
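A hedged sketch of the two-phase idea using SimpleITK: registering binary segmentation masks sidesteps the $Ca^{2+}$ intensity oscillations that confound intensity-based registration. The thresholding "segmentation" and B-spline model below are illustrative stand-ins for the paper's network and transformation, not its exact method.

```python
import numpy as np
import SimpleITK as sitk

def register_masks(fixed_mask, moving_mask, mesh_size=(4, 4)):
    """Estimate a non-rigid (B-spline) transform aligning two binary masks."""
    fixed = sitk.GetImageFromArray(fixed_mask.astype(np.float32))
    moving = sitk.GetImageFromArray(moving_mask.astype(np.float32))
    tx = sitk.BSplineTransformInitializer(fixed, list(mesh_size))
    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMeanSquares()  # masks are binary, so MSE is a fair metric
    reg.SetOptimizerAsLBFGSB(numberOfIterations=50)
    reg.SetInitialTransform(tx, True)
    reg.SetInterpolator(sitk.sitkLinear)
    return reg.Execute(fixed, moving)

# Phase 1 stand-in: "segment" two frames by thresholding (a DNN in the paper).
frame_a, frame_b = np.random.rand(64, 64), np.random.rand(64, 64)
mask_a = (frame_a > 0.5).astype(np.uint8)
mask_b = (frame_b > 0.5).astype(np.uint8)
# Phase 2: estimate the non-rigid transform from the masks.
transform = register_masks(mask_a, mask_b)
```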
Abstract: Image segmentation is a fundamental problem in biomedical image analysis. Recent advances in deep learning have achieved promising results on many biomedical image segmentation benchmarks. However, due to large variations in biomedical images (different modalities, image settings, objects, noise, etc.), applying deep learning to a new application usually requires a new set of training data. This can incur a great deal of annotation effort and cost, because only biomedical experts can annotate effectively, and often there are too many instances in images (e.g., cells) to annotate. In this paper, we aim to address the following question: with limited effort (e.g., time) for annotation, which instances should be annotated to attain the best performance? We present a deep active learning framework that combines fully convolutional networks (FCNs) with active learning to significantly reduce annotation effort by making judicious suggestions on the most effective annotation areas. We utilize uncertainty and similarity information provided by the FCN and formulate a generalized version of the maximum set cover problem to determine the most representative and uncertain areas for annotation. Extensive experiments using the 2015 MICCAI Gland Challenge dataset and a lymph node ultrasound image segmentation dataset show that, using annotation suggestions from our method, state-of-the-art segmentation performance can be achieved with only 50% of the training data.
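A minimal sketch of the suggestion step: among the most uncertain candidate areas, greedily pick a subset that best "covers" the rest under a similarity measure (the standard greedy approximation for maximum set cover). The feature descriptors, thresholds, and parameters are illustrative assumptions; the paper's exact formulation is abstracted away.

```python
import numpy as np

def suggest_areas(features, uncertainty, k, top_frac=0.5, sim_thresh=0.8):
    """features: (n, d) descriptors of candidate areas; pick k to annotate."""
    n = len(features)
    # Keep only the most uncertain candidates, as in uncertainty-based AL.
    pool = np.argsort(-uncertainty)[: max(k, int(top_frac * n))]
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T  # cosine similarity between areas
    covered = np.zeros(n, dtype=bool)
    chosen = []
    for _ in range(k):  # greedy max-cover over the uncertain pool
        gains = [(np.sum(~covered[sim[i] > sim_thresh]), i)
                 for i in pool if i not in chosen]
        gain, best = max(gains)
        chosen.append(best)
        covered |= sim[best] > sim_thresh
    return chosen

rng = np.random.default_rng(1)
feats = rng.normal(size=(100, 32))  # descriptors of 100 candidate areas
unc = rng.random(100)               # FCN-derived uncertainty per area
print(suggest_areas(feats, unc, k=5))
```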
Abstract: In this paper, we consider the problem of automatically segmenting neuronal cells in dual-color confocal microscopy images. This problem is a key task in various quantitative analysis applications in neuroscience, such as tracing cell genesis in Danio rerio (zebrafish) brains. Deep learning, especially with fully convolutional networks (FCNs), has profoundly changed segmentation research in biomedical imaging. We face two major challenges in this problem. First, neuronal cells may form dense clusters, making it difficult to correctly identify all individual cells (even for human experts); consequently, the segmentation results of known FCN-type models are not accurate enough. Second, pixel-wise ground truth is difficult to obtain; only a limited amount of approximate instance-wise annotation can be collected, which makes training FCN models quite cumbersome. We propose a new FCN-type deep learning model, called deep complete bipartite networks (CB-Net), and a new scheme for leveraging approximate instance-wise annotation to train our pixel-wise prediction model. Evaluated using seven real datasets, our proposed CB-Net model outperforms state-of-the-art FCN models and produces neuron segmentation results of remarkable quality.
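A hedged PyTorch sketch of the connection pattern the name "complete bipartite" suggests: a decoder stage that receives (resized) features from every encoder level rather than only its mirror level. This is one plausible reading to illustrate the wiring; the channel sizes and blocks are assumptions, not the exact CB-Net architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BipartiteDecoderStage(nn.Module):
    def __init__(self, enc_channels, out_ch):
        super().__init__()
        self.fuse = nn.Conv2d(sum(enc_channels), out_ch, 3, padding=1)

    def forward(self, enc_feats, size):
        # Resize every encoder feature to this stage's resolution and fuse,
        # so each decoder stage is connected to all encoder stages.
        resized = [F.interpolate(f, size=size, mode="bilinear",
                                 align_corners=False) for f in enc_feats]
        return F.relu(self.fuse(torch.cat(resized, dim=1)))

enc_feats = [torch.randn(1, 64, 32, 32), torch.randn(1, 128, 16, 16),
             torch.randn(1, 256, 8, 8)]           # toy encoder pyramid
stage = BipartiteDecoderStage((64, 128, 256), out_ch=64)
y = stage(enc_feats, size=(32, 32))               # -> (1, 64, 32, 32)
```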
Abstract: Convolution is a fundamental operation in many applications, such as computer vision, natural language processing, and image processing. Recent successes of convolutional neural networks in various deep learning applications place even higher demands on fast convolution. The high computation throughput and memory bandwidth of graphics processing units (GPUs) make GPUs a natural choice for accelerating convolution operations. However, maximally exploiting the available memory bandwidth of GPUs for convolution is a challenging task. This paper introduces a general model to address the mismatch between the memory bank width of GPUs and the computation data width of threads. Based on this model, we develop two convolution kernels, one for the general case and the other for a special case with one input channel. By carefully optimizing memory access patterns and computation patterns, we design a communication-optimized kernel for the special case and a communication-reduced kernel for the general case. Experimental data from implementations on Kepler GPUs show that our kernels achieve 5.16x and 35.5% average performance improvements over the latest cuDNN library for the special case and the general case, respectively.
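A back-of-the-envelope Python model of the bank-width mismatch mentioned above: when the per-thread data width differs from the memory bank width, threads in a warp can touch distinct rows of the same bank and serialize. This simplified model (Kepler-like parameters, and each element counted only at its starting word) is my own illustration, not the model from the paper.

```python
def conflict_degree(num_threads, elem_bytes, bank_bytes, num_banks, stride=1):
    """Max distinct bank-rows touched per bank by one warp access
    (1 = conflict-free; threads sharing one bank row are served together)."""
    rows_per_bank = {}
    for t in range(num_threads):
        byte_addr = t * stride * elem_bytes
        row = byte_addr // bank_bytes   # which bank-width word is touched
        bank = row % num_banks          # which bank that word lives in
        rows_per_bank.setdefault(bank, set()).add(row)
    return max(len(rows) for rows in rows_per_bank.values())

# 32 threads reading consecutive 4-byte floats from 8-byte-wide banks:
print(conflict_degree(32, elem_bytes=4, bank_bytes=8, num_banks=32))  # -> 1
# 32 threads reading consecutive 8-byte doubles from 4-byte-wide banks,
# a width mismatch: two rows per bank, i.e., two-way conflicts:
print(conflict_degree(32, elem_bytes=8, bank_bytes=4, num_banks=32))  # -> 2
```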
Abstract: Automatic detection of lymphocytes in H&E images is a necessary first step in many tissue image analysis algorithms, and an accurate and robust automated lymphocyte detection approach is of great importance in both computer science and clinical studies. Most existing approaches for lymphocyte detection are based on traditional image processing algorithms and/or classic machine learning methods. In recent years, deep learning techniques have fundamentally transformed the way computers interpret images and have become a leading solution for various pattern recognition problems. In this work, we design a new deep neural network model that extends the fully convolutional network by combining ideas from several recent techniques, such as shortcut links. We also design a new training scheme that takes prior knowledge about lymphocytes into consideration. The training scheme not only efficiently exploits the limited amount of free-form annotations from pathologists, but also naturally supports efficient fine-tuning; as a consequence, our model has the potential for self-improvement by leveraging the errors collected during real applications. Our experiments show that our deep neural network model achieves good performance on images with different staining conditions or tissue types.
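A hedged PyTorch sketch of the "shortcut link" idea mentioned above: a residual-style block whose input bypasses the convolutions and is added back to their output, easing optimization of deeper FCNs. The block sizes are illustrative; this is not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class ShortcutBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.body(x))  # shortcut: identity + residual

x = torch.randn(1, 32, 64, 64)
print(ShortcutBlock(32)(x).shape)          # torch.Size([1, 32, 64, 64])
```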
Abstract: Segmentation of 3D images is a fundamental problem in biomedical image analysis. Deep learning (DL) approaches have achieved state-of-the-art segmentation performance. To exploit 3D contexts with neural networks, known DL segmentation methods, including 3D convolution, 2D convolution on planes orthogonal to the 2D image slices, and LSTMs in multiple directions, all suffer from incompatibility with the highly anisotropic dimensions of common 3D biomedical images. In this paper, we propose a new DL framework for 3D image segmentation based on a combination of a fully convolutional network (FCN) and a recurrent neural network (RNN), which are responsible for exploiting intra-slice and inter-slice contexts, respectively. To the best of our knowledge, this is the first DL framework for 3D image segmentation that explicitly leverages 3D image anisotropy. Evaluated using a dataset from the ISBI Neuronal Structure Segmentation Challenge and in-house image stacks for 3D fungus segmentation, our approach achieves promising results compared to the known DL-based 3D segmentation approaches.
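A hedged PyTorch sketch of the FCN+RNN split described above: a 2D convolutional stage handles intra-slice context at full xy resolution, and an LSTM sweeps along the coarse z axis for inter-slice context, matching the anisotropy of typical 3D stacks. The layers are illustrative stand-ins, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SliceFCNPlusRNN(nn.Module):
    def __init__(self, in_ch=1, feat=16, out_ch=2):
        super().__init__()
        self.fcn = nn.Sequential(nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU())
        self.rnn = nn.LSTM(feat, feat, batch_first=True, bidirectional=True)
        self.head = nn.Conv2d(2 * feat, out_ch, 1)

    def forward(self, vol):  # vol: (B, C, Z, H, W)
        b, c, z, h, w = vol.shape
        # Intra-slice: run the 2D FCN on every z-slice independently.
        f = self.fcn(vol.permute(0, 2, 1, 3, 4).reshape(b * z, c, h, w))
        feat = f.shape[1]
        # Inter-slice: bidirectional LSTM over z at each (x, y) location.
        seq = f.reshape(b, z, feat, h, w).permute(0, 3, 4, 1, 2)  # (B,H,W,Z,F)
        out, _ = self.rnn(seq.reshape(b * h * w, z, feat))
        out = out.reshape(b, h, w, z, -1).permute(0, 3, 4, 1, 2)  # (B,Z,2F,H,W)
        return self.head(out.reshape(b * z, -1, h, w)).reshape(b, z, -1, h, w)

vol = torch.randn(1, 1, 8, 32, 32)   # anisotropic stack: few z slices
print(SliceFCNPlusRNN()(vol).shape)  # torch.Size([1, 8, 2, 32, 32])
```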