Qin Huang

Design Pseudo Ground Truth with Motion Cue for Unsupervised Video Object Segmentation

Dec 13, 2018

Ye Wang, Jongmoo Choi, Yueru Chen, Qin Huang, Siyang Li, Ming-Sui Lee, C. -C. Jay Kuo

Abstract: One major technical debt in video object segmentation is the labeling of object masks for training instances. We therefore propose to prepare inexpensive yet high-quality pseudo ground truth, corrected with motion cues, for video object segmentation training. Our method conducts semantic segmentation using instance segmentation networks and then selects the segmented object of interest as the pseudo ground truth based on motion information. Afterwards, the pseudo ground truth is used to fine-tune the pretrained objectness network to facilitate object segmentation in the remaining frames of the video. We show that the pseudo ground truth can effectively improve segmentation performance. This straightforward unsupervised video object segmentation method is more efficient than existing methods. Experimental results on DAVIS and FBMS show that the proposed method outperforms state-of-the-art unsupervised segmentation methods. Moreover, the category-agnostic pseudo ground truth has great potential to extend to tracking multiple arbitrary objects.

* 16 pages, 7 figures, 6 tables, conference
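
The selection step described in the abstract (choosing one segmented instance as pseudo ground truth using motion information) can be illustrated with a minimal sketch. The abstract does not state the actual selection rule, so the criterion below (highest mean optical-flow magnitude inside the mask) is an assumption, and the names `mean_motion` and `select_pseudo_ground_truth` are hypothetical.

```python
def mean_motion(mask, flow_mag):
    """Average flow magnitude over the pixels covered by a binary mask."""
    total = 0.0
    count = 0
    for mask_row, flow_row in zip(mask, flow_mag):
        for inside, magnitude in zip(mask_row, flow_row):
            if inside:
                total += magnitude
                count += 1
    # An empty mask carries no motion evidence.
    return total / count if count else 0.0


def select_pseudo_ground_truth(masks, flow_mag):
    """Return the index of the instance mask with the strongest motion.

    Assumption: the object of interest is the instance that moves most.
    This is an illustrative rule, not the paper's stated criterion.
    """
    if not masks:
        return None
    scores = [mean_motion(mask, flow_mag) for mask in masks]
    return max(range(len(masks)), key=lambda i: scores[i])


# Toy frame: the right-hand column moves, the left-hand block is static.
flow_mag = [[0.1, 0.1, 2.0],
            [0.1, 0.1, 2.5]]
static_mask = [[1, 1, 0],
               [1, 1, 0]]
moving_mask = [[0, 0, 1],
               [0, 0, 1]]
chosen = select_pseudo_ground_truth([static_mask, moving_mask], flow_mag)
```

Here `chosen` indexes the moving instance, whose mask would then serve as the pseudo label for fine-tuning.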

SPG-Net: Segmentation Prediction and Guidance Network for Image Inpainting

Aug 06, 2018

Yuhang Song, Chao Yang, Yeji Shen, Peng Wang, Qin Huang, C. -C. Jay Kuo

Abstract: In this paper, we focus on the image inpainting task, which aims to recover the missing area of an incomplete image given the context information. Recent developments in deep generative models enable an efficient end-to-end framework for image synthesis and inpainting tasks, but existing methods based on generative models do not exploit segmentation information to constrain object shapes, which usually leads to blurry results at the boundary. To tackle this problem, we propose to introduce semantic segmentation information, which disentangles the inter-class difference and intra-class variation for image inpainting. This leads to much clearer recovered boundaries between semantically different regions and better texture within semantically consistent segments. Our model factorizes the image inpainting process into two steps, segmentation prediction (SP-Net) and segmentation guidance (SG-Net), which first predict the segmentation labels in the missing area and then generate segmentation-guided inpainting results. Experiments on multiple public datasets show that our approach outperforms existing methods in image inpainting quality, and the interactive segmentation guidance makes multi-modal predictions of image inpainting possible.

* BMVC 2018 camera ready

Contextual-based Image Inpainting: Infer, Match, and Translate

Jul 25, 2018

Yuhang Song, Chao Yang, Zhe Lin, Xiaofeng Liu, Qin Huang, Hao Li, C. -C. Jay Kuo

Abstract: We study the task of image inpainting, which is to fill in the missing region of an incomplete image with plausible contents. To this end, we propose a learning-based approach to generate visually coherent completions given a high-resolution image with missing components. To overcome the difficulty of directly learning the distribution of high-dimensional image data, we divide the task into two separate steps, inference and translation, and model each step with a deep neural network. We also use simple heuristics to guide the propagation of local textures from the boundary to the hole. We show that, by using such techniques, inpainting reduces to the problem of learning two image-feature translation functions in a much smaller space, which is hence easier to train. We evaluate our method on several public datasets and show that we generate results of better visual quality than previous state-of-the-art methods.

* ECCV 2018 camera ready

Instance Embedding Transfer to Unsupervised Video Object Segmentation

Feb 27, 2018

Siyang Li, Bryan Seybold, Alexey Vorobyov, Alireza Fathi, Qin Huang, C. -C. Jay Kuo

Abstract: We propose a method for unsupervised video object segmentation by transferring the knowledge encapsulated in image-based instance embedding networks. The instance embedding network produces an embedding vector for each pixel that enables identifying all pixels belonging to the same object. Though trained on static images, the instance embeddings are stable over consecutive video frames, which allows us to link objects together over time. Thus, we adapt the instance networks trained on static images to video object segmentation and combine the embeddings with objectness and optical flow features, without model retraining or online fine-tuning. The proposed method outperforms state-of-the-art unsupervised segmentation methods on the DAVIS and FBMS datasets.

* To appear in CVPR 2018

Multiple Instance Curriculum Learning for Weakly Supervised Object Detection

Nov 25, 2017

Siyang Li, Xiangxin Zhu, Qin Huang, Hao Xu, C. -C. Jay Kuo

Abstract: When supervising an object detector with weakly labeled data, most existing approaches are prone to being trapped in the discriminative object parts, e.g., finding the face of a cat instead of the full body, due to the lack of supervision on the extent of full objects. To address this challenge, we incorporate object segmentation into the detector training, which guides the model to correctly localize the full objects. We propose the multiple instance curriculum learning (MICL) method, which injects curriculum learning (CL) into the multiple instance learning (MIL) framework. The MICL method starts by automatically picking the easy training examples, where the extent of the segmentation masks agrees with the detection bounding boxes. The training set is gradually expanded to include harder examples to train strong detectors that handle complex images. The proposed MICL method with segmentation in the loop outperforms the state-of-the-art weakly supervised object detectors by a substantial margin on the PASCAL VOC datasets.

* Published in BMVC 2017
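
The curriculum idea in the abstract (start with examples where the segmentation mask extent agrees with the detection box, then expand to harder ones) can be sketched as follows. This is only an illustration under stated assumptions: agreement is measured here as the IoU between the mask's tight bounding box and the detection box, and the thresholds, the `(name, mask, box)` example format, and all function names are hypothetical rather than taken from the paper.

```python
def mask_extent(mask):
    """Tight bounding box (x0, y0, x1, y1) of a binary mask, or None if empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return (min(cols), min(rows), max(cols) + 1, max(rows) + 1)


def box_iou(a, b):
    """Intersection over union of two (x0, y0, x1, y1) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


def curriculum_stages(examples, thresholds=(0.8, 0.2)):
    """Build growing training sets, easiest first.

    examples: list of (name, mask, detection_box) tuples (assumed format).
    thresholds: decreasing agreement cut-offs, one per stage (assumed values).
    """
    scored = []
    for name, mask, det_box in examples:
        extent = mask_extent(mask)
        agreement = box_iou(extent, det_box) if extent else 0.0
        scored.append((name, agreement))
    # Each later stage admits examples with lower mask/box agreement.
    return [[name for name, score in scored if score >= t] for t in thresholds]


mask = [[0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0]]
examples = [("easy", mask, (1, 1, 3, 3)),   # box matches the mask extent
            ("hard", mask, (0, 0, 4, 4))]   # box is much larger than the mask
stages = curriculum_stages(examples)
```

With these toy inputs the first stage holds only the well-aligned example and the second stage adds the harder one.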

A Taught-Observe-Ask Method for Object Detection with Critical Supervision

Nov 03, 2017

Chi-Hao Wu, Qin Huang, Siyang Li, C. -C. Jay Kuo

Abstract: Inspired by a child's learning experience (being taught first, followed by observation and questioning), we investigate a critically supervised learning methodology for object detection in this work. Specifically, we propose a taught-observe-ask (TOA) method that consists of several novel components such as negative object proposal, critical example mining, and machine-guided question-answer (QA) labeling. To consider labeling time and performance jointly, new evaluation methods are developed to compare the performance of the TOA method with that of fully and weakly supervised learning methods. Extensive experiments are conducted on the PASCAL VOC and the Caltech benchmark datasets. The TOA method significantly improves on the performance of weak supervision yet demands only about 3-6% of the labeling time of full supervision. The effectiveness of each novel component is also analyzed.

Semantic Segmentation with Reverse Attention

Jul 20, 2017

Qin Huang, Chunyang Xia, Chihao Wu, Siyang Li, Ye Wang, Yuhang Song, C. -C. Jay Kuo

Abstract: Recent developments in fully convolutional neural networks enable efficient end-to-end learning of semantic segmentation. Traditionally, the convolutional classifiers are taught to learn the representative semantic features of labeled semantic objects. In this work, we propose a reverse attention network (RAN) architecture that trains the network to capture the opposite concept (i.e., what is not associated with a target class) as well. The RAN is a three-branch network that performs the direct, reverse and reverse-attention learning processes simultaneously. Extensive experiments are conducted to show the effectiveness of the RAN in semantic segmentation. Built upon DeepLabv2-LargeFOV, the RAN achieves a state-of-the-art mIoU score (48.1%) on the challenging PASCAL-Context dataset. Significant performance improvements are also observed on the PASCAL-VOC, Person-Part, NYUDv2 and ADE20K datasets.

* accepted for oral presentation in BMVC 2017
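
The mIoU score quoted in the abstract is the mean, over classes, of the intersection over union between predicted and ground-truth label maps. A minimal sketch of that metric is given below; the flattened-list input, the `ignore_label` value of 255, and the choice to average only over classes that appear in the prediction or the ground truth are assumptions, since benchmark protocols differ in these details.

```python
def mean_iou(pred, target, num_classes, ignore_label=255):
    """Mean IoU over classes for flattened per-pixel label lists.

    Assumes every predicted label lies in range(num_classes) and that
    pixels whose ground truth equals ignore_label are excluded.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue
        if p == t:
            # Pixel counts toward both intersection and union of its class.
            inter[t] += 1
            union[t] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: IoU 1/2; class 1: IoU 2/3; mean = 7/12.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```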

Object Boundary Guided Semantic Segmentation

Jul 06, 2016

Qin Huang, Chunyang Xia, Wenchao Zheng, Yuhang Song, Hao Xu, C. -C. Jay Kuo

Abstract: Semantic segmentation is critical to image content understanding and object localization. Recent developments in fully convolutional neural networks (FCNs) have enabled accurate pixel-level labeling. One issue in previous works is that the FCN-based method does not exploit object boundary information to delineate segmentation details, since the object boundary label is ignored in network training. To tackle this problem, we introduce a double-branch fully convolutional neural network, which separates the learning of the desired semantic class labeling from mask-level object proposals guided by relabeled boundaries. This network, called object boundary guided FCN (OBG-FCN), is able to integrate the distinct properties of object shape and class features elegantly in a fully convolutional way with a designed masking architecture. We conduct experiments on the PASCAL VOC segmentation benchmark and show that the end-to-end trainable OBG-FCN system offers great improvement in semantic segmentation quality.

* The results in the first version of this paper are mistaken due to overlapping validation data and incorrect benchmark methods
