Matthew Brown

Pose2Instance: Harnessing Keypoints for Person Instance Segmentation

Apr 04, 2017

Subarna Tripathi, Maxwell Collins, Matthew Brown, Serge Belongie

Abstract: Human keypoints are a well-studied representation of people. We explore how to use keypoint models to improve instance-level person segmentation. The main idea is to use a distance transform of oracle-provided keypoints or estimated keypoint heatmaps as a prior for the person instance segmentation task within a deep neural network. For training and evaluation, we consider all images from COCO for which both instance segmentation and human keypoint annotations are available. We first show how oracle keypoints can boost the performance of an existing human segmentation model during inference without any training. Next, we propose a framework to directly learn a deep instance segmentation model conditioned on human pose. Experimental results show that at various Intersection over Union (IoU) thresholds, in a constrained environment with oracle keypoints, the instance segmentation accuracy achieves 10% to 12% relative improvements over a strong baseline of oracle bounding boxes. In a more realistic environment, without the oracle keypoints, the proposed deep person instance segmentation model conditioned on human pose achieves 3.8% to 10.5% relative improvements compared with its strongest baseline of a deep network trained only for segmentation.
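
To make the two quantities named in this abstract concrete, here is a minimal sketch of a keypoint distance transform and of mask IoU. It is not the authors' implementation: the function names, the brute-force nearest-keypoint search, the optional truncation, and the empty-mask convention are all assumptions made for illustration.

```python
import math


def keypoint_distance_map(height, width, keypoints, truncate=None):
    """Distance from every pixel to its nearest keypoint (brute force).

    keypoints is a list of (x, y) pixel coordinates. If truncate is given,
    distances are clipped to that value so far-away pixels saturate.
    """
    if not keypoints:
        raise ValueError("at least one keypoint is required")
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            d = min(math.hypot(x - kx, y - ky) for kx, ky in keypoints)
            row.append(d if truncate is None else min(d, truncate))
        rows.append(row)
    return rows


def mask_iou(mask_a, mask_b):
    """Intersection over Union of two binary masks given as nested lists."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Convention assumed here: two empty masks have IoU 0.0.
    return inter / union if union else 0.0
```

A map like this could be stacked with the image as an extra input channel, which is one plausible way to supply a pose prior to a segmentation network; how the paper actually injects the prior is not stated in the abstract.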

Nonrigid Optical Flow Ground Truth for Real-World Scenes with Time-Varying Shading Effects

Jul 15, 2016

Wenbin Li, Darren Cosker, Zhihan Lv, Matthew Brown

Abstract: In this paper we present a dense ground truth dataset of nonrigidly deforming real-world scenes. Our dataset contains both long and short video sequences, and enables the quantitative evaluation of RGB-based tracking and registration methods. To construct ground truth for the RGB sequences, we simultaneously capture Near-Infrared (NIR) image sequences where dense markers - visible only in NIR - represent ground truth positions. This allows for comparison with automatically tracked RGB positions and the formation of error metrics. Most previous datasets containing nonrigidly deforming sequences are based on synthetic data. Our capture protocol enables us to acquire real-world deforming objects with realistic photometric effects - such as blur and illumination change - as well as occlusion and complex deformations. A public evaluation website is constructed to allow for ranking of RGB-image-based optical flow and other dense tracking algorithms, with various statistical measures. Furthermore, we present an RGB-NIR multispectral optical flow model allowing for energy optimization by adaptively combining feature information from both the RGB and the complementary NIR channels. In our experiments we evaluate eight existing RGB-based optical flow methods on our new dataset. We also evaluate our hybrid optical flow algorithm by comparing it to two existing multispectral approaches, as well as varying our input channels across RGB, NIR and RGB-NIR.

* preprint of our paper accepted by RA-L'16
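
As an illustration of comparing tracked RGB positions against NIR marker positions, the sketch below computes a mean endpoint error, a common optical flow metric. The abstract does not say which error metrics the dataset uses, so the choice of metric and the function name are assumptions.

```python
import math


def mean_endpoint_error(tracked, ground_truth):
    """Average Euclidean distance between tracked and ground-truth points.

    Both arguments are equal-length lists of (x, y) positions, where
    ground_truth[i] is the marker position matching tracked[i].
    """
    if len(tracked) != len(ground_truth):
        raise ValueError("point lists must have the same length")
    if not tracked:
        raise ValueError("at least one point is required")
    total = sum(
        math.hypot(tx - gx, ty - gy)
        for (tx, ty), (gx, gy) in zip(tracked, ground_truth)
    )
    return total / len(tracked)
```

For example, one point tracked 5 pixels away from its marker and one tracked exactly gives a mean error of 2.5 pixels.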

Drift Robust Non-rigid Optical Flow Enhancement for Long Sequences

Mar 07, 2016

Wenbin Li, Darren Cosker, Matthew Brown

Abstract: Densely tracking a nonrigid object over the long term is hard, and is a fundamental research issue in the computer vision community. This task often relies on estimating pairwise correspondences between images over time, where errors accumulate and lead to drift. In this paper, we introduce a novel optimization framework with an Anchor Patch constraint. It is designed to significantly reduce overall errors in long sequences containing nonrigidly deformable objects. Our framework can be applied to any dense tracking algorithm, e.g. optical flow. We demonstrate the success of our approach by showing significant error reduction on 6 popular optical flow algorithms applied to a range of real-world nonrigid benchmarks. We also provide quantitative analysis of our approach given synthetic occlusions and image noise.

* Preprint version of our paper accepted by Journal of Intelligent and Fuzzy Systems
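
The drift problem described in this abstract can be shown with a toy one-dimensional example: chaining frame-to-frame displacement estimates that each carry a small error makes the position error grow with sequence length. This sketch only illustrates the accumulation; it does not implement the Anchor Patch constraint, and all names and numbers are hypothetical.

```python
def chain_positions(start, displacements):
    """Integrate per-frame displacements into a track of positions."""
    positions = [start]
    for step in displacements:
        positions.append(positions[-1] + step)
    return positions


def drift_per_frame(true_steps, estimated_steps, start=0.0):
    """Absolute position error at each frame after chaining the estimates."""
    true_track = chain_positions(start, true_steps)
    est_track = chain_positions(start, estimated_steps)
    return [abs(e - t) for e, t in zip(est_track, true_track)]


# Hypothetical data: the point really moves 1.0 px per frame, but each
# pairwise estimate is biased by +0.1 px.
errors = drift_per_frame([1.0] * 5, [1.1] * 5)
```

Here the error is zero at the first frame and reaches about 0.5 px after five frames, although no single pairwise estimate is off by more than 0.1 px.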

Decision Forests, Convolutional Networks and the Models in-Between

Mar 03, 2016

Yani Ioannou, Duncan Robertson, Darko Zikic, Peter Kontschieder, Jamie Shotton, Matthew Brown, Antonio Criminisi

Abstract: This paper investigates the connections between two state-of-the-art classifiers: decision forests (DFs, including decision jungles) and convolutional neural networks (CNNs). Decision forests are computationally efficient thanks to their conditional computation property (computation is confined to only a small region of the tree, the nodes along a single branch). CNNs achieve state-of-the-art accuracy, thanks to their representation learning capabilities. We present a systematic analysis of how to fuse conditional computation with representation learning and achieve a continuum of hybrid models with different ratios of accuracy vs. efficiency. We call this new family of hybrid models conditional networks. Conditional networks can be thought of as: i) decision trees augmented with data transformation operators, or ii) CNNs with block-diagonal sparse weight matrices and explicit data routing functions. Experimental validation is performed on the common task of image classification on both the CIFAR and ImageNet datasets. Compared to state-of-the-art CNNs, our hybrid models yield the same accuracy with a fraction of the compute cost and a much smaller number of parameters.

* Microsoft Research Technical Report
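
A rough sketch of the second reading of conditional networks (block-diagonal weights plus routing) follows. Each block sees only its own slice of the input, and a routing decision selects which blocks are evaluated at all. The `active` set stands in for a learned router and, like the function names, is a hypothetical placeholder rather than the paper's design.

```python
def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def block_diagonal_layer(x, blocks, active):
    """Apply a block-diagonal weight matrix, skipping inactive blocks.

    blocks[i] is a dense matrix acting on its own contiguous slice of x.
    active is the set of block indices chosen by the (hypothetical) router;
    blocks outside it cost no multiplications and output zeros.
    """
    outputs = []
    offset = 0
    for i, block in enumerate(blocks):
        width = len(block[0])
        segment = x[offset:offset + width]
        offset += width
        if i in active:
            outputs.extend(matvec(block, segment))
        else:
            outputs.extend([0.0] * len(block))
    return outputs
```

With two blocks and only the first active, the second slice of the input is never multiplied, which is the sense in which computation is confined to one branch.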
