"Image": models, code, and papers

Streaming an image through the eye: The retina seen as a dithered scalable image coder

Feb 10, 2012
Khaled Masmoudi, Marc Antonini, Pierre Kornprobst

We propose the design of an original scalable image coder/decoder inspired by the mammalian retina. Our coder accounts for the time-dependent and nondeterministic behavior of the actual retina. The present work makes two main contributions: (i) we design a deterministic image coder mimicking most of the retinal processing stages, and (ii) we introduce retinal noise into the coding process, modeled here as a dither signal, to gain interesting perceptual features. Regarding our first contribution, our main source of inspiration is the biologically plausible retina model called Virtual Retina. The main novelty of this coder is to show that the time-dependent behavior of the retinal cells could implicitly ensure scalability and bit allocation. Regarding our second contribution, we reconsider the inner layers of the retina and offer a possible interpretation of the non-determinism that neurophysiologists observe in their output. To this end, we model the retinal noise that occurs in these layers as a dither signal. The dithering process that we propose adds several interesting features to our image coder. The dither noise whitens the reconstruction error and decorrelates it from the input stimuli. Furthermore, integrating the dither noise into our coder allows faster recognition of the fine details of the image during decoding. The goal of the present paper is twofold. First, we aim to mimic the retina as closely as possible in the design of a novel image coder while maintaining encouraging performance. Second, we bring new insight into the non-deterministic behavior of the retina.
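
The claim that dither noise decorrelates the reconstruction error from the input can be illustrated with a generic quantizer. The following is a minimal sketch, assuming a uniform scalar quantizer with subtractive dither; it is not the retina-inspired coder from the paper, and the function names, step size, and input value are hypothetical choices made for illustration.

```python
import math
import random


def quantize(x, step):
    """Uniform quantizer: snap x to the nearest multiple of step."""
    return step * math.floor(x / step + 0.5)


def dithered_quantize(x, step, rng):
    """Subtractive dither: add uniform noise before quantizing, remove it after."""
    d = rng.uniform(-step / 2, step / 2)
    return quantize(x + d, step) - d


def mean_error(x, step, trials, rng):
    """Average reconstruction error for a fixed input over many dither draws."""
    total = 0.0
    for _ in range(trials):
        total += dithered_quantize(x, step, rng) - x
    return total / trials


rng = random.Random(0)   # fixed seed so the run is repeatable
x, step = 0.1, 0.25      # arbitrary toy values (assumptions)
plain_error = quantize(x, step) - x              # identical on every run, tied to x
dither_error = mean_error(x, step, 20000, rng)   # averages out close to zero
print(plain_error, dither_error)
```

Without dither, the error for a given input is a fixed function of that input. With subtractive dither, the error for the same input varies from draw to draw and has zero mean, which is the sense in which it no longer depends on the stimulus.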

* arXiv admin note: substantial text overlap with arXiv:1104.1550

Unsupervised Microvascular Image Segmentation Using an Active Contours Mimicking Neural Network

Aug 04, 2019
Shir Gur, Lior Wolf, Lior Golgher, Pablo Blinder

The task of blood vessel segmentation in microscopy images is crucial for many diagnostic and research applications. However, vessels can look vastly different depending on the transient imaging conditions, and collecting data for supervised training is laborious. We present a novel deep learning method for unsupervised segmentation of blood vessels. The method is inspired by the field of active contours, and we introduce a new loss term based on the morphological Active Contours Without Edges (ACWE) optimization method. The role of the morphological operators is played by novel pooling layers that are incorporated into the network's architecture. We demonstrate the challenges faced by previous supervised learning solutions when the imaging conditions shift. Our unsupervised method outperforms such previous methods both on the labeled dataset and when applied to similar but different datasets. Our code, as well as efficient PyTorch reimplementations of the baseline methods VesselNN and DeepVess, is available on GitHub: https://github.com/shirgur/UMIS.
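
For orientation, the region term that ACWE-style methods minimize can be written in a few lines. This is a minimal sketch of the classical Chan-Vese region energy on a binary mask, assuming plain nested lists as the image format; it omits the contour length and area regularizers and is not the loss term or the pooling layers introduced in the paper. The function names are hypothetical.

```python
def region_mean(image, mask, inside):
    """Mean intensity of pixels whose mask value matches `inside` (0.0 if none)."""
    values = [
        pixel
        for img_row, mask_row in zip(image, mask)
        for pixel, m in zip(img_row, mask_row)
        if bool(m) == inside
    ]
    return sum(values) / len(values) if values else 0.0


def acwe_energy(image, mask):
    """Chan-Vese region energy: squared deviation from each region's mean."""
    c_in = region_mean(image, mask, True)
    c_out = region_mean(image, mask, False)
    energy = 0.0
    for img_row, mask_row in zip(image, mask):
        for pixel, m in zip(img_row, mask_row):
            target = c_in if m else c_out
            energy += (pixel - target) ** 2
    return energy


image = [[0, 0, 1, 1],
         [0, 0, 1, 1]]
good_mask = [[0, 0, 1, 1],
             [0, 0, 1, 1]]   # follows the bright region
bad_mask = [[0, 1, 0, 1],
            [0, 1, 0, 1]]    # mixes bright and dark pixels
print(acwe_energy(image, good_mask), acwe_energy(image, bad_mask))
```

A mask that separates the two intensity populations gets a lower energy than one that mixes them, and no labels are needed to compute it, which is what makes this kind of term usable as an unsupervised objective.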

University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization

Feb 27, 2020
Zhedong Zheng, Yunchao Wei, Yi Yang

We consider the problem of cross-view geo-localization. The primary challenge of this task is to learn features that are robust to large viewpoint changes. Existing benchmarks can help, but are limited in the number of viewpoints. Image pairs containing two viewpoints, e.g., satellite and ground, are usually provided, which may compromise the feature learning. Besides phone cameras and satellites, we argue in this paper that drones could serve as a third platform for the geo-localization problem. In contrast to traditional ground-view images, drone-view images encounter fewer obstacles, e.g., trees, and can provide a comprehensive view when flying around the target place. To verify the effectiveness of the drone platform, we introduce a new multi-view multi-source benchmark for drone-based geo-localization, named University-1652. University-1652 contains data from three platforms, i.e., synthetic drones, satellites and ground cameras, covering 1,652 university buildings around the world. To our knowledge, University-1652 is the first drone-based geo-localization dataset, and it enables two new tasks, i.e., drone-view target localization and drone navigation. As the name implies, drone-view target localization aims to predict the location of the target place from drone-view images. Drone navigation, on the other hand, aims to drive the drone to the area of interest shown in a satellite-view query image. We use this dataset to analyze a variety of off-the-shelf CNN features and propose a strong CNN baseline on this challenging dataset. The experiments show that University-1652 helps the model to learn viewpoint-invariant features and also shows good generalization ability in real-world scenarios.

Local Adaptation Improves Accuracy of Deep Learning Model for Automated X-Ray Thoracic Disease Detection : A Thai Study

Apr 28, 2020
Isarun Chamveha, Trongtum Tongdee, Pairash Saiviroonporn, Warasinee Chaisangmongkon

Despite much promising research in the area of artificial intelligence for medical image diagnosis, no large-scale validation study has been done in Thailand to confirm the accuracy and utility of such algorithms when applied to local datasets. Here we present a wide-reaching development and testing of a deep learning algorithm for automated thoracic disease detection, utilizing 421,859 local chest radiographs. Our study shows that convolutional neural networks can achieve remarkable performance in detecting 13 common abnormality conditions on chest X-rays (CXR), and that the incorporation of local images into the training set is key to the model's success. This paper presents a state-of-the-art model for CXR abnormality detection, reaching an average AUROC of 0.91. This model, if integrated into the workflow, can result in up to 55.6% work reduction for medical practitioners in the CXR analysis process. Our work emphasizes the importance of investing in local research on medical diagnosis algorithms to ensure safe and efficient usage within the intended region.

* 9 pages, 2 figures, 3 tables

Multi-Scale Boosted Dehazing Network with Dense Feature Fusion

Apr 28, 2020
Hang Dong, Jinshan Pan, Lei Xiang, Zhe Hu, Xinyi Zhang, Fei Wang, Ming-Hsuan Yang

In this paper, we propose a Multi-Scale Boosted Dehazing Network with Dense Feature Fusion based on the U-Net architecture. The proposed method is designed based on two principles, boosting and error feedback, and we show that they are suitable for the dehazing problem. By incorporating the Strengthen-Operate-Subtract boosting strategy in the decoder of the proposed model, we develop a simple yet effective boosted decoder to progressively restore the haze-free image. To address the issue of preserving spatial information in the U-Net architecture, we design a dense feature fusion module using the back-projection feedback scheme. We show that the dense feature fusion module can simultaneously remedy the missing spatial information from high-resolution features and exploit the non-adjacent features. Extensive evaluations demonstrate that the proposed model performs favorably against the state-of-the-art approaches on the benchmark datasets as well as real-world hazy images.
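
As it is usually written in the denoising literature, a Strengthen-Operate-Subtract update takes the current estimate x, adds it to the observation y, applies a restoration operator f, and subtracts the estimate again: x_next = f(y + x) - x. The following is a minimal sketch of that single update on a 1-D signal, assuming a toy 3-tap smoothing filter as the operator. How the paper embeds this strategy in its multi-scale decoder is not described in the abstract, and the function names here are hypothetical.

```python
def smooth(signal):
    """Toy stand-in for a restoration operator: 3-tap mean with clamped edges."""
    n = len(signal)
    return [
        (signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3
        for i in range(n)
    ]


def sos_step(observation, estimate, operator):
    """One Strengthen-Operate-Subtract update: operator(y + x) - x."""
    strengthened = [y + x for y, x in zip(observation, estimate)]  # strengthen
    operated = operator(strengthened)                              # operate
    return [o - x for o, x in zip(operated, estimate)]             # subtract


observation = [1.0, 5.0, 3.0, 4.0]      # arbitrary toy signal (assumption)
estimate = smooth(observation)          # initial estimate from the operator alone
refined = sos_step(observation, estimate, smooth)
print(refined)
```

Adding the estimate back before applying the operator raises the signal level relative to the degradation, so the operator works on an easier input; the subtraction then removes the part that was added.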

* Accepted by CVPR 2020. The code is available at https://github.com/BookerDeWitt/MSBDN-DFF

It GAN DO Better: GAN-based Detection of Objects on Images with Varying Quality

Dec 03, 2019
Charan D. Prakash, Lina J. Karam

In this paper, we propose a novel generative framework that uses Generative Adversarial Networks (GANs) to generate features that provide robustness for object detection on reduced-quality images. The proposed GAN-based Detection of Objects (GAN-DO) framework is not restricted to any particular architecture and can be generalized to several deep neural network (DNN) based architectures. The resulting deep neural network maintains the exact architecture of the selected baseline model, without increasing the model's parameter complexity or slowing inference. We first evaluate the effect of image quality not only on object classification but also on object bounding box regression. We then test the models resulting from our proposed GAN-DO framework, using two state-of-the-art object detection architectures as the baseline models. We also evaluate the effect of the number of re-trained parameters in the generator of GAN-DO on the accuracy of the final trained model. Performance results obtained using GAN-DO on object detection datasets establish improved robustness to varying image quality and a higher mAP compared to existing approaches.

Privacy-Aware Activity Classification from First Person Office Videos

Jun 11, 2020
Partho Ghosh, Md. Abrar Istiak, Nayeeb Rashid, Ahsan Habib Akash, Ridwan Abrar, Ankan Ghosh Dastider, Asif Shahriyar Sushmit, Taufiq Hasan

With the advent of wearable body-cameras, human activity classification from First-Person Videos (FPV) has become a topic of increasing importance for various applications, including life-logging, law enforcement, sports, the workplace, and healthcare. One of the challenging aspects of FPV is its exposure to potentially sensitive objects within the user's field of view. In this work, we developed a privacy-aware activity classification system focusing on office videos. We utilized a Mask-RCNN with an Inception-ResNet hybrid as a feature extractor for detecting and then blurring out sensitive objects (e.g., digital screens, human faces, paper) in the videos. For activity classification, we incorporate an ensemble of Recurrent Neural Networks (RNNs) with ResNet, ResNext, and DenseNet based feature extractors. The proposed system was trained and evaluated on the 18-class FPV office video dataset made available through the IEEE Video and Image Processing (VIP) Cup 2019 competition. On the original unprotected FPVs, the proposed activity classifier ensemble reached an accuracy of 85.078%, with precision, recall, and F1 scores of 0.88, 0.85, and 0.86, respectively. On privacy-protected videos, the performance was slightly degraded, with accuracy, precision, recall, and F1 scores of 73.68%, 0.79, 0.75, and 0.74, respectively. The presented system won the 3rd prize in the IEEE VIP Cup 2019 competition.

Maximum entropy methods for texture synthesis: theory and practice

Dec 03, 2019
Valentin De Bortoli, Agnes Desolneux, Alain Durmus, Bruno Galerne, Arthur Leclaire

Recent years have seen the rise of convolutional neural network techniques in exemplar-based image synthesis. These methods often rely on the minimization of some variational formulation on the image space, for which the minimizers are assumed to be the solutions of the synthesis problem. In this paper we investigate, both theoretically and experimentally, another framework to deal with this problem using an alternate sampling/minimization scheme. First, we use results from information geometry to show that our method yields a probability measure which has maximum entropy under some constraints in expectation. Then, we turn to the analysis of our method and show, using recent results from the Markov chain literature, that its error can be explicitly bounded with constants which depend polynomially on the dimension, even in the non-convex setting. This includes the case where the constraints are defined via a differentiable neural network. Finally, we present an extensive experimental study of the model, including a comparison with state-of-the-art methods and an extension to style transfer.

Where is the Fake? Patch-Wise Supervised GANs for Texture Inpainting

Nov 06, 2019
Ahmed Ben Saad, Youssef Tamaazousti, Josselin Kherroubi, Alexis He

We tackle the problem of texture inpainting, where the input images are textures with missing values, along with masks that indicate the zones that should be generated. Much work has been done in image inpainting with the aim of achieving global and local consistency, but these methods still suffer from limitations when dealing with textures. In fact, the local information in the image to be completed needs to be used in order to achieve local continuity and visually realistic texture inpainting. For this, we propose a new segmentor discriminator that performs a patch-wise real/fake classification and is supervised by the input masks. During training, it aims to locate the fake regions and thus backpropagates a consistent signal to the generator. We tested our approach on the publicly available DTD dataset and showed that it achieves state-of-the-art performance and handles local consistency better than existing methods.
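
One way to picture mask-supervised, patch-wise real/fake classification is to turn the inpainting mask into a grid of per-patch targets. The following is a minimal sketch, assuming that a patch counts as fake when it contains at least one masked pixel and that patches are non-overlapping square tiles. Both rules are assumptions made for illustration; the abstract does not specify how the paper derives its targets, and the function name is hypothetical.

```python
def patch_labels(mask, patch):
    """Label each patch x patch tile 1 (fake) if it contains any masked pixel."""
    rows, cols = len(mask), len(mask[0])
    labels = []
    for top in range(0, rows, patch):
        row_labels = []
        for left in range(0, cols, patch):
            # Collect the mask values covered by this tile, clipped at the border.
            tile = [
                mask[r][c]
                for r in range(top, min(top + patch, rows))
                for c in range(left, min(left + patch, cols))
            ]
            row_labels.append(1 if any(tile) else 0)
        labels.append(row_labels)
    return labels


mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]            # 1 marks pixels the generator had to fill in
print(patch_labels(mask, 2))     # → [[1, 0], [0, 1]]
```

A discriminator trained against such a grid has to say where the generated content is, not only whether the image as a whole looks real, which gives the generator a spatially localized training signal.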

* Submitted to ICASSP 2020

The Edge of Depth: Explicit Constraints between Segmentation and Depth

Apr 01, 2020
Shengjie Zhu, Garrick Brazil, Xiaoming Liu

In this work we study the mutual benefits of two common computer vision tasks: self-supervised depth estimation and semantic segmentation from images. For example, to help unsupervised monocular depth estimation, constraints from semantic segmentation have been explored implicitly, such as by sharing and transforming features. In contrast, we propose to explicitly measure the border consistency between segmentation and depth and to improve it in a greedy manner by iteratively supervising the network towards a locally optimal solution. This is motivated in part by our observation that semantic segmentation, even when trained with limited ground truth (200 images of KITTI), can offer more accurate borders than any (monocular or stereo) image-based depth estimation. Through extensive experiments, our proposed approach advances the state of the art in unsupervised monocular depth estimation on KITTI.
