Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

"Image": models, code, and papers

Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation

Mar 02, 2021
Yukun Su, Ruizhou Sun, Guosheng Lin, Qingyao Wu

Figure 1 for Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation

Figure 2 for Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation

Figure 3 for Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation

Figure 4 for Context Decoupling Augmentation for Weakly Supervised Semantic Segmentation

Data augmentation is vital for deep learning neural networks. By providing massive training samples, it helps to improve the generalization ability of the model. Weakly supervised semantic segmentation (WSSS) is a challenging problem that has been deeply studied in recent years, conventional data augmentation approaches for WSSS usually employ geometrical transformations, random cropping and color jittering. However, merely increasing the same contextual semantic data does not bring much gain to the networks to distinguish the objects, e.g., the correct image-level classification of "aeroplane" may be not only due to the recognition of the object itself, but also its co-occurrence context like "sky", which will cause the model to focus less on the object features. To this end, we present a Context Decoupling Augmentation (CDA) method, to change the inherent context in which the objects appear and thus drive the network to remove the dependence between object instances and contextual information. To validate the effectiveness of the proposed method, extensive experiments on PASCAL VOC 2012 dataset with several alternative network architectures demonstrate that CDA can boost various popular WSSS methods to the new state-of-the-art by a large margin.

* 10 pages, 7 figures

Via

Access Paper or Ask Questions

Noise2Inpaint: Learning Referenceless Denoising by Inpainting Unrolling

Jun 16, 2020
Burhaneddin Yaman, Seyed Amir Hossein Hosseini, Mehmet Akçakaya

Figure 1 for Noise2Inpaint: Learning Referenceless Denoising by Inpainting Unrolling

Figure 2 for Noise2Inpaint: Learning Referenceless Denoising by Inpainting Unrolling

Figure 3 for Noise2Inpaint: Learning Referenceless Denoising by Inpainting Unrolling

Figure 4 for Noise2Inpaint: Learning Referenceless Denoising by Inpainting Unrolling

Deep learning based image denoising methods have been recently popular due to their improved performance. Traditionally, these methods are trained in a supervised manner, requiring a set of noisy input and clean target image pairs. More recently, self-supervised approaches have been proposed to learn denoising from noisy images only, without requiring clean ground truth during training. Succinctly, these methods assume that an image pixel is correlated with its neighboring pixels, while the noise is independent. In this work, building on these approaches and recent methods from image reconstruction, we introduce Noise2Inpaint (N2I), a training approach that recasts the denoising problem into a regularized image inpainting framework. This allows us to use an objective function, which can incorporate different statistical properties of the noise as needed. We use algorithm unrolling to unroll an iterative optimization for solving this objective function and train the unrolled network end-to-end. The training is self-supervised without requiring clean target images, where pixels in the noisy image are split into two disjoint sets. One of these is used to impose data fidelity in the unrolled network, while the other one defines the loss. We demonstrate that N2I performs successful denoising on real-world datasets, while preserving better details compared to its self-supervised counterpart Noise2Void.

Via

Access Paper or Ask Questions

Generative Collaborative Networks for Single Image Super-Resolution

Mar 12, 2019
Mohamed El Amine Seddik, Mohamed Tamaazousti, John Lin

Figure 1 for Generative Collaborative Networks for Single Image Super-Resolution

Figure 2 for Generative Collaborative Networks for Single Image Super-Resolution

Figure 3 for Generative Collaborative Networks for Single Image Super-Resolution

Figure 4 for Generative Collaborative Networks for Single Image Super-Resolution

A common issue of deep neural networks-based methods for the problem of Single Image Super-Resolution (SISR), is the recovery of finer texture details when super-resolving at large upscaling factors. This issue is particularly related to the choice of the objective loss function. In particular, recent works proposed the use of a VGG loss which consists in minimizing the error between the generated high resolution images and ground-truth in the feature space of a Convolutional Neural Network (VGG19), pre-trained on the very "large" ImageNet dataset. When considering the problem of super-resolving images with a distribution "far" from the ImageNet images distribution (\textit{e.g.,} satellite images), their proposed \textit{fixed} VGG loss is no longer relevant. In this paper, we present a general framework named \textit{Generative Collaborative Networks} (GCN), where the idea consists in optimizing the \textit{generator} (the mapping of interest) in the feature space of a \textit{features extractor} network. The two networks (generator and extractor) are \textit{collaborative} in the sense that the latter "helps" the former, by constructing discriminative and relevant features (not necessarily \textit{fixed} and possibly learned \textit{mutually} with the generator). We evaluate the GCN framework in the context of SISR, and we show that it results in a method that is adapted to super-resolution domains that are "far" from the ImageNet domain.

Via

Access Paper or Ask Questions

Image based cellular contractile force evaluation with small-world network inspired CNN: SW-UNet

Aug 23, 2019
Li Honghan, Daiki Matsunaga, Tsubasa S. Matsui, Hiroki Aosaki, Shinji Deguchi

Figure 1 for Image based cellular contractile force evaluation with small-world network inspired CNN: SW-UNet

Figure 2 for Image based cellular contractile force evaluation with small-world network inspired CNN: SW-UNet

Figure 3 for Image based cellular contractile force evaluation with small-world network inspired CNN: SW-UNet

Figure 4 for Image based cellular contractile force evaluation with small-world network inspired CNN: SW-UNet

We propose an image-based cellular contractile force evaluation method using a machine learning technique. We use a special substrate that exhibits wrinkles when cells grab the substrate and contract, and the wrinkles can be used to visualize the force magnitude and direction. In order to extract wrinkles from the microscope images, we develop a new CNN (convolutional neural network) architecture SW-UNet (small-world U-Net), which is a CNN that reflects the concept of the small-world network. The SW-UNet shows better performance in wrinkle segmentation task compared to other methods: the error (Euclidean distance) of SW-UNet is 4.9 times smaller than 2D-FFT (fast Fourier transform) based segmentation approach, and is 2.9 times smaller than U-Net. As a demonstration, we compare the contractile force of U2OS (human osteosarcoma) cells and show that cells with a mutation in the KRAS oncogne show larger force compared to the wild-type cells. Our new machine learning based algorithm provides us an efficient, automated and accurate method to evaluate the cell contractile force.

Via

Access Paper or Ask Questions

Efficient Near-Field Imaging Using Cylindrical MIMO Arrays

Jan 22, 2021
Shiyong Li, Shuoguang Wang, Moeness G. Amin, Guoqiang Zhao

Figure 1 for Efficient Near-Field Imaging Using Cylindrical MIMO Arrays

Figure 2 for Efficient Near-Field Imaging Using Cylindrical MIMO Arrays

Figure 3 for Efficient Near-Field Imaging Using Cylindrical MIMO Arrays

Figure 4 for Efficient Near-Field Imaging Using Cylindrical MIMO Arrays

Multiple-input multiple-output (MIMO) array based millimeter-wave (MMW) imaging has a tangible prospect in applications of concealed weapons detection. A near-field imaging algorithm based on wavenumber domain processing is proposed for a cylindrical MIMO array scheme with uniformly spaced transmit and receive antennas over both the vertical and horizontal-arc directions. The spectrum aliasing associated with the proposed MIMO array is analyzed through a zero-filling discrete-time Fourier transform. The analysis shows that an undersampled array can be used in recovering the MMW image by a wavenumber domain algorithm. The requirements for the antenna inter-element spacing of the MIMO array are delineated. Numerical simulations as well as comparisons with the backprojection (BP) algorithm are provided to demonstrate the effectiveness of the proposed method.

* 10 pages, 20 figures, paper submitted to IEEE Transactions on Aerospace and Electronic Systems

Via

Access Paper or Ask Questions

Spatial Attention Point Network for Deep-learning-based Robust Autonomous Robot Motion Generation

Mar 02, 2021
Hideyuki Ichiwara, Hiroshi Ito, Kenjiro Yamamoto, Hiroki Mori, Tetsuya Ogata

Figure 1 for Spatial Attention Point Network for Deep-learning-based Robust Autonomous Robot Motion Generation

Figure 2 for Spatial Attention Point Network for Deep-learning-based Robust Autonomous Robot Motion Generation

Figure 3 for Spatial Attention Point Network for Deep-learning-based Robust Autonomous Robot Motion Generation

Figure 4 for Spatial Attention Point Network for Deep-learning-based Robust Autonomous Robot Motion Generation

Deep learning provides a powerful framework for automated acquisition of complex robotic motions. However, despite a certain degree of generalization, the need for vast amounts of training data depending on the work-object position is an obstacle to industrial applications. Therefore, a robot motion-generation model that can respond to a variety of work-object positions with a small amount of training data is necessary. In this paper, we propose a method robust to changes in object position by automatically extracting spatial attention points in the image for the robot task and generating motions on the basis of their positions. We demonstrate our method with an LBR iiwa 7R1400 robot arm on a picking task and a pick-and-place task at various positions in various situations. In each task, the spatial attention points are obtained for the work objects that are important to the task. Our method is robust to changes in object position. Further, it is robust to changes in background, lighting, and obstacles that are not important to the task because it only focuses on positions that are important to the task.

Via

Access Paper or Ask Questions

An Objective Evaluation Metric for image fusion based on Del Operator

May 19, 2019
Ali A. Kiaei, Hassan Khotanlou, Paniz Kiaei, Yasin Bhrouzi, Mahdi Abbasi

Figure 1 for An Objective Evaluation Metric for image fusion based on Del Operator

Figure 2 for An Objective Evaluation Metric for image fusion based on Del Operator

Figure 3 for An Objective Evaluation Metric for image fusion based on Del Operator

Figure 4 for An Objective Evaluation Metric for image fusion based on Del Operator

In this paper, a novel objective evaluation metric for image fusion is presented. Remarkable and attractive points of the proposed metric are that it has no parameter, the result is probability in the range of [0, 1] and it is free from illumination dependence. This metric is easy to implement and the result is computed in four steps: (1) Smoothing the images using Gaussian filter. (2) Transforming images to a vector field using Del operator. (3) Computing the normal distribution function ({\mu},{\sigma}) for each corresponding pixel, and converting to the standard normal distribution function. (4) Computing the probability of being well-behaved fusion method as the result. To judge the quality of the proposed metric, it is compared to thirteen well-known non-reference objective evaluation metrics, where eight fusion methods are employed on seven experiments of multimodal medical images. The experimental results and statistical comparisons show that in contrast to the previously objective evaluation metrics the proposed one performs better in terms of both agreeing with human visual perception and evaluating fusion methods that are not performed at the same level.

* 22 pages, 14 Figures

Via

Access Paper or Ask Questions

Face Morphing Attack Generation & Detection: A Comprehensive Survey

Nov 03, 2020
Sushma Venkatesh, Raghavendra Ramachandra, Kiran Raja, Christoph Busch

Figure 1 for Face Morphing Attack Generation & Detection: A Comprehensive Survey

Figure 2 for Face Morphing Attack Generation & Detection: A Comprehensive Survey

Figure 3 for Face Morphing Attack Generation & Detection: A Comprehensive Survey

Figure 4 for Face Morphing Attack Generation & Detection: A Comprehensive Survey

The vulnerability of Face Recognition System (FRS) to various kind of attacks (both direct and in-direct attacks) and face morphing attacks has received a great interest from the biometric community. The goal of a morphing attack is to subvert the FRS at Automatic Border Control (ABC) gates by presenting the Electronic Machine Readable Travel Document (eMRTD) or e-passport that is obtained based on the morphed face image. Since the application process for the e-passport in the majority countries requires a passport photo to be presented by the applicant, a malicious actor and the accomplice can generate the morphed face image and to obtain the e-passport. An e-passport with a morphed face images can be used by both the malicious actor and the accomplice to cross the border as the morphed face image can be verified against both of them. This can result in a significant threat as a malicious actor can cross the border without revealing the track of his/her criminal background while the details of accomplice are recorded in the log of the access control system. This survey aims to present a systematic overview of the progress made in the area of face morphing in terms of both morph generation and morph detection. In this paper, we describe and illustrate various aspects of face morphing attacks, including different techniques for generating morphed face images but also the state-of-the-art regarding Morph Attack Detection (MAD) algorithms based on a stringent taxonomy and finally the availability of public databases, which allow to benchmark new MAD algorithms in a reproducible manner. The outcomes of competitions/benchmarking, vulnerability assessments and performance evaluation metrics are also provided in a comprehensive manner. Furthermore, we discuss the open challenges and potential future works that need to be addressed in this evolving field of biometrics.

Via

Access Paper or Ask Questions

FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding

Dec 05, 2020
Maryam Rahnemoonfar, Tashnim Chowdhury, Argho Sarkar, Debvrat Varshney, Masoud Yari, Robin Murphy

Figure 1 for FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding

Figure 2 for FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding

Figure 3 for FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding

Figure 4 for FloodNet: A High Resolution Aerial Imagery Dataset for Post Flood Scene Understanding

Visual scene understanding is the core task in making any crucial decision in any computer vision system. Although popular computer vision datasets like Cityscapes, MS-COCO, PASCAL provide good benchmarks for several tasks (e.g. image classification, segmentation, object detection), these datasets are hardly suitable for post disaster damage assessments. On the other hand, existing natural disaster datasets include mainly satellite imagery which have low spatial resolution and a high revisit period. Therefore, they do not have a scope to provide quick and efficient damage assessment tasks. Unmanned Aerial Vehicle(UAV) can effortlessly access difficult places during any disaster and collect high resolution imagery that is required for aforementioned tasks of computer vision. To address these issues we present a high resolution UAV imagery, FloodNet, captured after the hurricane Harvey. This dataset demonstrates the post flooded damages of the affected areas. The images are labeled pixel-wise for semantic segmentation task and questions are produced for the task of visual question answering. FloodNet poses several challenges including detection of flooded roads and buildings and distinguishing between natural water and flooded water. With the advancement of deep learning algorithms, we can analyze the impact of any disaster which can make a precise understanding of the affected areas. In this paper, we compare and contrast the performances of baseline methods for image classification, semantic segmentation, and visual question answering on our dataset.

* 11 pages

Via

Access Paper or Ask Questions