Object detection is considered one of the most challenging problems in this field of computer vision, as it involves the combination of object classification and object localization within a scene. Recently, deep neural networks (DNNs) have been demonstrated to achieve superior object detection performance compared to other approaches, with YOLOv2 (an improved You Only Look Once model) being one of the state-of-the-art in DNN-based object detection methods in terms of both speed and accuracy. Although YOLOv2 can achieve real-time performance on a powerful GPU, it still remains very challenging for leveraging this approach for real-time object detection in video on embedded computing devices with limited computational power and limited memory. In this paper, we propose a new framework called Fast YOLO, a fast You Only Look Once framework which accelerates YOLOv2 to be able to perform object detection in video on embedded devices in a real-time manner. First, we leverage the evolutionary deep intelligence framework to evolve the YOLOv2 network architecture and produce an optimized architecture (referred to as O-YOLOv2 here) that has 2.8X fewer parameters with just a ~2% IOU drop. To further reduce power consumption on embedded devices while maintaining performance, a motion-adaptive inference method is introduced into the proposed Fast YOLO framework to reduce the frequency of deep inference with O-YOLOv2 based on temporal motion characteristics. Experimental results show that the proposed Fast YOLO framework can reduce the number of deep inferences by an average of 38.13%, and an average speedup of ~3.3X for objection detection in video compared to the original YOLOv2, leading Fast YOLO to run an average of ~18FPS on a Nvidia Jetson TX1 embedded system.
Evolutionary deep intelligence was recently proposed as a method for achieving highly efficient deep neural network architectures over successive generations. Drawing inspiration from nature, we propose the incorporation of sexual evolutionary synthesis. Rather than the current asexual synthesis of networks, we aim to produce more compact feature representations by synthesizing more diverse and generalizable offspring networks in subsequent generations via the combination of two parent networks. Experimental results were obtained using the MNIST and CIFAR-10 datasets, and showed improved architectural efficiency and comparable testing accuracy relative to the baseline asexual evolutionary neural networks. In particular, the network synthesized via sexual evolutionary synthesis for MNIST had approximately double the architectural efficiency (cluster efficiency of 34.29X and synaptic efficiency of 258.37X) in comparison to the network synthesized via asexual evolutionary synthesis, with both networks achieving a testing accuracy of ~97%.
A key contributing factor to incredible success of deep neural networks has been the significant rise on massively parallel computing devices allowing researchers to greatly increase the size and depth of deep neural networks, leading to significant improvements in modeling accuracy. Although deeper, larger, or complex deep neural networks have shown considerable promise, the computational complexity of such networks is a major barrier to utilization in resource-starved scenarios. We explore the synaptogenesis of deep neural networks in the formation of efficient deep neural network architectures within an evolutionary deep intelligence framework, where a probabilistic generative modeling strategy is introduced to stochastically synthesize increasingly efficient yet effective offspring deep neural networks over generations, mimicking evolutionary processes such as heredity, random mutation, and natural selection in a probabilistic manner. In this study, we primarily explore the imposition of synaptic precision restrictions and its impact on the evolutionary synthesis of deep neural networks to synthesize more efficient network architectures tailored for resource-starved scenarios. Experimental results show significant improvements in synaptic efficiency (~10X decrease for GoogLeNet-based DetectNet) and inference speed (>5X increase for GoogLeNet-based DetectNet) while preserving modeling accuracy.
A promising paradigm for achieving highly efficient deep neural networks is the idea of evolutionary deep intelligence, which mimics biological evolution processes to progressively synthesize more efficient networks. A crucial design factor in evolutionary deep intelligence is the genetic encoding scheme used to simulate heredity and determine the architectures of offspring networks. In this study, we take a deeper look at the notion of synaptic cluster-driven evolution of deep neural networks which guides the evolution process towards the formation of a highly sparse set of synaptic clusters in offspring networks. Utilizing a synaptic cluster-driven genetic encoding, the probabilistic encoding of synaptic traits considers not only individual synaptic properties but also inter-synaptic relationships within a deep neural network. This process results in highly sparse offspring networks which are particularly tailored for parallel computational devices such as GPUs and deep neural network accelerator chips. Comprehensive experimental results using four well-known deep neural network architectures (LeNet-5, AlexNet, ResNet-56, and DetectNet) on two different tasks (object categorization and object detection) demonstrate the efficiency of the proposed method. Cluster-driven genetic encoding scheme synthesizes networks that can achieve state-of-the-art performance with significantly smaller number of synapses than that of the original ancestor network. ($\sim$125-fold decrease in synapses for MNIST). Furthermore, the improved cluster efficiency in the generated offspring networks ($\sim$9.71-fold decrease in clusters for MNIST and a $\sim$8.16-fold decrease in clusters for KITTI) is particularly useful for accelerated performance on parallel computing hardware architectures such as those in GPUs and deep neural network accelerator chips.
Lung cancer is the leading cause for cancer related deaths. As such, there is an urgent need for a streamlined process that can allow radiologists to provide diagnosis with greater efficiency and accuracy. A powerful tool to do this is radiomics: a high-dimension imaging feature set. In this study, we take the idea of radiomics one step further by introducing the concept of discovery radiomics for lung cancer prediction using CT imaging data. In this study, we realize these custom radiomic sequencers as deep convolutional sequencers using a deep convolutional neural network learning architecture. To illustrate the prognostic power and effectiveness of the radiomic sequences produced by the discovered sequencer, we perform cancer prediction between malignant and benign lesions from 97 patients using the pathologically-proven diagnostic data from the LIDC-IDRI dataset. Using the clinically provided pathologically-proven data as ground truth, the proposed framework provided an average accuracy of 77.52% via 10-fold cross-validation with a sensitivity of 79.06% and specificity of 76.11%, surpassing the state-of-the art method.
Taking inspiration from biological evolution, we explore the idea of "Can deep neural networks evolve naturally over successive generations into highly efficient deep neural networks?" by introducing the notion of synthesizing new highly efficient, yet powerful deep neural networks over successive generations via an evolutionary process from ancestor deep neural networks. The architectural traits of ancestor deep neural networks are encoded using synaptic probability models, which can be viewed as the `DNA' of these networks. New descendant networks with differing network architectures are synthesized based on these synaptic probability models from the ancestor networks and computational environmental factor models, in a random manner to mimic heredity, natural selection, and random mutation. These offspring networks are then trained into fully functional networks, like one would train a newborn, and have more efficient, more diverse network architectures than their ancestor networks, while achieving powerful modeling capabilities. Experimental results for the task of visual saliency demonstrated that the synthesized `evolved' offspring networks can achieve state-of-the-art performance while having network architectures that are significantly more efficient (with a staggering $\sim$48-fold decrease in synapses by the fourth generation) compared to the original ancestor network.
There has been significant recent interest towards achieving highly efficient deep neural network architectures. A promising paradigm for achieving this is the concept of evolutionary deep intelligence, which attempts to mimic biological evolution processes to synthesize highly-efficient deep neural networks over successive generations. An important aspect of evolutionary deep intelligence is the genetic encoding scheme used to mimic heredity, which can have a significant impact on the quality of offspring deep neural networks. Motivated by the neurobiological phenomenon of synaptic clustering, we introduce a new genetic encoding scheme where synaptic probability is driven towards the formation of a highly sparse set of synaptic clusters. Experimental results for the task of image classification demonstrated that the synthesized offspring networks using this synaptic cluster-driven genetic encoding scheme can achieve state-of-the-art performance while having network architectures that are not only significantly more efficient (with a ~125-fold decrease in synapses for MNIST) compared to the original ancestor network, but also tailored for GPU-accelerated machine learning applications.
Diffusion weighted magnetic resonance imaging (DW-MR) is a powerful tool in imaging-based prostate cancer screening and detection. Endorectal coils are commonly used in DW-MR imaging to improve the signal-to-noise ratio (SNR) of the acquisition, at the expense of significant intensity inhomogeneities (bias field) that worsens as we move away from the endorectal coil. The presence of bias field can have a significant negative impact on the accuracy of different image analysis tasks, as well as prostate tumor localization, thus leading to increased inter- and intra-observer variability. Retrospective bias correction approaches are introduced as a more efficient way of bias correction compared to the prospective methods such that they correct for both of the scanner and anatomy-related bias fields in MR imaging. Previously proposed retrospective bias field correction methods suffer from undesired noise amplification that can reduce the quality of bias-corrected DW-MR image. Here, we propose a unified data reconstruction approach that enables joint compensation of bias field as well as data noise in DW-MR imaging. The proposed noise-compensated, bias-corrected (NCBC) data reconstruction method takes advantage of a novel stochastically fully connected joint conditional random field (SFC-JCRF) model to mitigate the effects of data noise and bias field in the reconstructed MR data. The proposed NCBC reconstruction method was tested on synthetic DW-MR data, physical DW-phantom as well as real DW-MR data all acquired using endorectal MR coil. Both qualitative and quantitative analysis illustrated that the proposed NCBC method can achieve improved image quality when compared to other tested bias correction methods. As such, the proposed NCBC method may have potential as a useful retrospective approach for improving the consistency of image interpretations.
The problem of automated crowd segmentation and counting has garnered significant interest in the field of video surveillance. This paper proposes a novel scene invariant crowd segmentation and counting algorithm designed with high accuracy yet low computational complexity in mind, which is key for widespread industrial adoption. A novel low-complexity, scale-normalized feature called Histogram of Moving Gradients (HoMG) is introduced for highly effective spatiotemporal representation of individuals and crowds within a video. Real-time crowd segmentation is achieved via boosted cascade of weak classifiers based on sliding-window HoMG features, while linear SVM regression of crowd-region HoMG features is employed for real-time crowd counting. Experimental results using multi-camera crowd datasets show that the proposed algorithm significantly outperform state-of-the-art crowd counting algorithms, as well as achieve very promising crowd segmentation results, thus demonstrating the efficacy of the proposed method for highly-accurate, real-time video-driven crowd analysis.
Magnetic Resonance Imaging (MRI) is a crucial medical imaging technology for the screening and diagnosis of frequently occurring cancers. However image quality may suffer by long acquisition times for MRIs due to patient motion, as well as result in great patient discomfort. Reducing MRI acquisition time can reduce patient discomfort and as a result reduces motion artifacts from the acquisition process. Compressive sensing strategies, when applied to MRI, have been demonstrated to be effective at decreasing acquisition times significantly by sparsely sampling the \emph{k}-space during the acquisition process. However, such a strategy requires advanced reconstruction algorithms to produce high quality and reliable images from compressive sensing MRI. This paper proposes a new reconstruction approach based on cross-domain stochastically fully connected conditional random fields (CD-SFCRF) for compressive sensing MRI. The CD-SFCRF introduces constraints in both \emph{k}-space and spatial domains within a stochastically fully connected graphical model to produce improved MRI reconstruction. Experimental results using T2-weighted (T2w) imaging and diffusion-weighted imaging (DWI) of the prostate show strong performance in preserving fine details and tissue structures in the reconstructed images when compared to other tested methods even at low sampling rates.