Abstract:Deep learning has achieved impressive results in camera localization, but current single-image techniques typically suffer from a lack of robustness, leading to large outliers. To some extent, this has been tackled by sequential (multi-image) or geometry-constraint approaches, which can learn to reject dynamic objects and illumination conditions to achieve better performance. In this work, we show that attention can be used to force the network to focus on more geometrically robust objects and features, achieving state-of-the-art performance on common benchmarks, even when using only a single image as input. Extensive experimental evidence is provided through public indoor and outdoor datasets. Through visualization of the saliency maps, we demonstrate how the network learns to reject dynamic objects, yielding superior global camera pose regression performance. The source code is available at https://github.com/BingCS/AtLoc.
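The attention idea above can be pictured as a self-attention layer that re-weights backbone features before the pose head. Below is a minimal sketch in PyTorch, not the released AtLoc code; the feature dimension, single-head formulation, and 7-DoF output (translation plus quaternion) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AttentionPoseRegressor(nn.Module):
    """Hedged sketch: self-attention over CNN features, then pose regression."""
    def __init__(self, feat_dim=2048):
        super().__init__()
        self.q = nn.Linear(feat_dim, feat_dim // 8)
        self.k = nn.Linear(feat_dim, feat_dim // 8)
        self.v = nn.Linear(feat_dim, feat_dim)
        self.pose_head = nn.Linear(feat_dim, 7)  # 3D position + orientation quaternion

    def forward(self, feats):
        # feats: (B, N, feat_dim) flattened spatial features from a CNN backbone.
        q, k = self.q(feats), self.k(feats)
        attn = torch.softmax(q @ k.transpose(1, 2) / q.size(-1) ** 0.5, dim=-1)
        attended = attn @ self.v(feats) + feats   # residual self-attention
        return self.pose_head(attended.mean(dim=1))  # global pose (x, y, z, quat)
```

Visualizing `attn` over the spatial positions would correspond to the saliency maps the abstract describes.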
Abstract:Facial recognition is a key enabling component for emerging Internet of Things (IoT) services such as smart homes or responsive offices. Through the use of deep neural networks, facial recognition has achieved excellent performance. However, this is only possible when trained with hundreds of images of each user in different viewing and lighting conditions. Clearly, this level of effort in enrolment and labelling is impossible for widespread deployment and adoption. Inspired by the fact that most people carry smart wireless devices with them, e.g. smartphones, we propose to use this wireless identifier as a supervisory label. This allows us to curate a dataset of facial images that are unique to a certain domain, e.g. a set of people in a particular office. This custom corpus can then be used to finetune existing pre-trained models, e.g. FaceNet. However, due to the vagaries of wireless propagation in buildings, the supervisory labels are noisy and weak. We propose a novel technique, AutoTune, which learns and refines the association between a face and a wireless identifier over time, by increasing the inter-cluster separation and minimizing the intra-cluster distance. Through extensive experiments with multiple users on two sites, we demonstrate the ability of AutoTune to design an environment-specific, continually evolving facial recognition system entirely without user effort.
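The cluster-shaping objective named above (larger inter-cluster separation, smaller intra-cluster distance) could be sketched as follows. This is a hedged illustration, not AutoTune's actual loss; the margin value and the use of centroid distances are assumptions.

```python
import torch

def cluster_loss(embeddings, labels, margin=0.5):
    """embeddings: (N, D) face features; labels: (N,) noisy wireless-derived IDs."""
    ids = labels.unique()
    centroids = torch.stack([embeddings[labels == i].mean(0) for i in ids])
    # Intra-cluster term: pull each embedding toward its assigned centroid.
    intra = torch.stack([
        (embeddings[labels == i] - c).norm(dim=1).mean()
        for i, c in zip(ids, centroids)
    ]).mean()
    # Inter-cluster term: hinge penalty on centroid pairs closer than the margin.
    d = torch.cdist(centroids, centroids)
    off_diag = d[~torch.eye(len(ids), dtype=torch.bool)]
    inter = torch.relu(margin - off_diag).mean()
    return intra + inter
```

In the setting the abstract describes, the labels themselves would be re-estimated over time as the clusters sharpen, which is where the "learns and refines the association" step comes in.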
Abstract:Dynamical models estimate and predict the temporal evolution of physical systems. State Space Models (SSMs) in particular represent the system dynamics with many desirable properties, such as being able to model uncertainty in both the model and measurements, and optimal (in the Bayesian sense) recursive formulations, e.g. the Kalman Filter. However, they require significant domain knowledge to derive the parametric form and considerable hand-tuning to correctly set all the parameters. Data-driven techniques, e.g. Recurrent Neural Networks, have emerged as compelling alternatives to SSMs with wide success across a number of challenging tasks, in part due to their ability to extract relevant features from rich inputs. However, they lack interpretability and robustness to unseen conditions. In this work, we present DynaNet, a hybrid deep learning and time-varying state-space model which can be trained end-to-end. Our neural Kalman dynamical model allows us to exploit the relative merits of each approach. We demonstrate state-of-the-art estimation and prediction on a number of physically challenging tasks, including visual odometry, sensor fusion for visual-inertial navigation, and pendulum control. In addition, we show how DynaNet can indicate failures through investigation of properties such as the rate of innovation (Kalman Gain).
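The recursive update at the heart of such a neural Kalman model is the standard differentiable Kalman step, where the transition and noise matrices would be predicted by a network instead of hand-derived. A minimal sketch, with all dimensions assumed:

```python
import torch

def kalman_step(x, P, z, A, H, Q, R):
    """One differentiable Kalman update.
    x: (n,) state, P: (n, n) covariance, z: (m,) measurement,
    A/H/Q/R: (possibly network-predicted) transition, emission, noise matrices."""
    # Predict.
    x_pred = A @ x
    P_pred = A @ P @ A.T + Q
    # Update: innovation and Kalman gain (whose magnitude can flag failures).
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ torch.linalg.inv(S)
    innovation = z - H @ x_pred
    x_new = x_pred + K @ innovation
    P_new = (torch.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new, K
```

Because every operation is differentiable, gradients flow through the filter into whichever network produces `A`, `Q`, etc., enabling the end-to-end training the abstract mentions; monitoring `K` gives the failure indicator.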
Abstract:This paper presents a novel method to distill knowledge from a deep pose regressor network for efficient Visual Odometry (VO). Standard distillation relies on "dark knowledge" for successful knowledge transfer. As this knowledge is not available in pose regression and the teacher prediction is not always accurate, we propose to emphasize the knowledge transfer only when we trust the teacher. We achieve this by using the teacher loss as a confidence score which places variable relative importance on the teacher prediction. We inject this confidence score into the main training task via an Attentive Imitation Loss (AIL) and into the learning of the teacher's intermediate representation via an Attentive Hint Training (AHT) approach. To the best of our knowledge, this is the first work to successfully distill knowledge from a deep pose regression network. Our evaluation on the KITTI and Malaga datasets shows that we can keep the student prediction close to the teacher's with up to 92.95% parameter reduction and a 2.12x speed-up in computation time.
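The confidence-weighting idea can be made concrete with a short sketch: the teacher's own error against ground truth becomes a per-sample weight on the student-teacher imitation term. This is an illustration consistent with the description above, not the paper's exact formulation; the normalization scheme is an assumption.

```python
import torch

def attentive_imitation_loss(student_pose, teacher_pose, gt_pose):
    """student_pose, teacher_pose, gt_pose: (B, D) pose predictions/targets."""
    teacher_err = (teacher_pose - gt_pose).norm(dim=1)   # per-sample teacher loss
    # High teacher error -> low confidence -> small imitation weight.
    conf = 1.0 - teacher_err / (teacher_err.max() + 1e-8)
    imitation = (student_pose - teacher_pose).norm(dim=1)
    supervised = (student_pose - gt_pose).norm(dim=1)
    return (supervised + conf * imitation).mean()
```

The same `conf` score would analogously weight the hint loss on intermediate features in the AHT stage.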
Abstract:We propose a novel, conceptually simple and general framework for instance segmentation on 3D point clouds. Our method, called 3D-BoNet, follows the simple design philosophy of per-point multilayer perceptrons (MLPs). The framework directly regresses 3D bounding boxes for all instances in a point cloud, while simultaneously predicting a point-level mask for each instance. It consists of a backbone network followed by two parallel network branches for 1) bounding box regression and 2) point mask prediction. 3D-BoNet is single-stage, anchor-free and end-to-end trainable. Moreover, it is remarkably computationally efficient as, unlike existing approaches, it does not require any post-processing steps such as non-maximum suppression, feature sampling, clustering or voting. Extensive experiments show that our approach surpasses existing work on both the ScanNet and S3DIS datasets while being approximately 10x more computationally efficient. Comprehensive ablation studies demonstrate the effectiveness of our design.
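The two parallel branches can be sketched as lightweight heads on shared per-point features: one regresses a fixed budget of boxes from a pooled global feature, the other scores each point against each instance. This is an illustrative sketch, not the released 3D-BoNet code; the feature size, the instance budget H, and the 6-value box parameterization are assumptions.

```python
import torch
import torch.nn as nn

class TwoBranchHead(nn.Module):
    """Hedged sketch of parallel box-regression and point-mask branches."""
    def __init__(self, feat_dim=256, max_instances=24):
        super().__init__()
        self.H = max_instances
        self.box_branch = nn.Linear(feat_dim, self.H * 6)   # two corners per box
        self.mask_branch = nn.Linear(feat_dim, self.H)      # per-point instance scores

    def forward(self, point_feats):
        # point_feats: (B, N, feat_dim) per-point features from the backbone MLPs.
        global_feat = point_feats.max(dim=1).values          # symmetric pooling
        boxes = self.box_branch(global_feat).view(-1, self.H, 6)
        masks = torch.sigmoid(self.mask_branch(point_feats)) # (B, N, H) soft masks
        return boxes, masks
```

Because boxes and masks come out in a single forward pass, no clustering or non-maximum suppression is needed afterwards, which is where the efficiency claim comes from.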
Abstract:Inspired by the cognitive process of humans and animals, Curriculum Learning (CL) trains a model by gradually increasing the difficulty of the training data. In this paper, we study whether CL can be applied to complex geometry problems like monocular Visual Odometry (VO) estimation. Unlike existing CL approaches, we present a novel CL strategy for learning the geometry of monocular VO by gradually making the learning objective more difficult during training. To this end, we propose a novel geometry-aware objective function that jointly optimizes relative and composite transformations over small windows via a bounded pose regression loss. A cascaded optical flow network followed by a recurrent network with a differentiable windowed composition layer, termed CL-VO, is devised to learn the proposed objective. Evaluation on three real-world datasets shows the superior performance of CL-VO over state-of-the-art feature-based and learning-based VO methods.
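The windowed objective can be illustrated by chaining predicted relative transforms and penalizing both the per-step and the composed error. This is a hedged sketch only: the Huber loss is one plausible choice for the "bounded" loss, and the 4x4 homogeneous-transform representation is an assumption.

```python
import torch
import torch.nn.functional as F

def windowed_pose_loss(pred_rel, gt_rel):
    """pred_rel, gt_rel: (W, 4, 4) relative transforms over a window of W frames."""
    rel_loss = F.huber_loss(pred_rel, gt_rel)        # bounded per-step error
    comp_pred, comp_gt = pred_rel[0], gt_rel[0]
    for t in range(1, pred_rel.size(0)):             # differentiable composition
        comp_pred = comp_pred @ pred_rel[t]
        comp_gt = comp_gt @ gt_rel[t]
    return rel_loss + F.huber_loss(comp_pred, comp_gt)
```

Growing the window W during training is one natural way to make this objective progressively harder, matching the curriculum strategy described above.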
Abstract:Deep learning approaches for Visual-Inertial Odometry (VIO) have proven successful, but they rarely focus on incorporating robust fusion strategies for dealing with imperfect input sensory data. We propose a novel end-to-end selective sensor fusion framework for monocular VIO, which fuses monocular images and inertial measurements in order to estimate the trajectory whilst improving robustness to real-life issues such as missing or corrupted data and poor sensor synchronization. In particular, we propose two fusion modalities based on different masking strategies, deterministic soft fusion and stochastic hard fusion, and compare them with previously proposed direct fusion baselines. During testing, the network is able to selectively process the features of the available sensor modalities and produce a trajectory at scale. We present a thorough investigation of performance on three public datasets covering autonomous driving, Micro Aerial Vehicle (MAV) flight, and hand-held VIO. The results demonstrate the effectiveness of the fusion strategies, which offer better performance than direct fusion, particularly in the presence of corrupted data. In addition, we study the interpretability of the fusion networks by visualising the masking layers in different scenarios and with varying data corruption, revealing interesting correlations between the fusion networks and imperfect sensory input data.
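The two masking strategies can be sketched side by side: soft fusion re-weights features with a sigmoid, while hard fusion samples a binary keep/drop mask via the Gumbel-softmax trick so the choice stays differentiable. A hedged sketch with assumed layer sizes, not the paper's exact architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelectiveFusion(nn.Module):
    """Hedged sketch of deterministic soft vs. stochastic hard feature fusion."""
    def __init__(self, vis_dim=512, imu_dim=128, hard=False):
        super().__init__()
        self.hard = hard
        self.mask_net = nn.Linear(vis_dim + imu_dim, vis_dim + imu_dim)

    def forward(self, vis_feat, imu_feat):
        fused = torch.cat([vis_feat, imu_feat], dim=-1)
        logits = self.mask_net(fused)
        if self.hard:
            # Stochastic hard mask: per-feature keep/drop sampled via Gumbel-softmax.
            two_class = torch.stack([logits, -logits], dim=-1)
            mask = F.gumbel_softmax(two_class, tau=1.0, hard=True)[..., 0]
        else:
            mask = torch.sigmoid(logits)   # deterministic soft re-weighting
        return fused * mask
```

Visualizing `mask` under injected corruption is what yields the interpretability analysis the abstract describes.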
Abstract:Deep Reinforcement Learning (DRL) has been applied successfully to many robotic applications. However, the large number of trials needed for training is a key issue. Most existing techniques developed to improve training efficiency (e.g. imitation) target general tasks rather than being tailored to robot applications, which have a specific context that can be exploited. We propose a novel framework, Assisted Reinforcement Learning, where a classical controller (e.g. a PID controller) is used as an alternative, switchable policy to speed up training of DRL for local planning and navigation problems. The core idea is that the simple control law allows the robot to rapidly learn sensible primitives, like driving in a straight line, instead of exploring randomly. As the actor network becomes more advanced, it can then take over to perform more complex actions, like obstacle avoidance. Eventually, the simple controller can be discarded entirely. We show that not only does this technique train faster, it is also less sensitive to the structure of the DRL network and consistently outperforms a standard Deep Deterministic Policy Gradient network. We demonstrate the results in both simulation and real-world experiments.
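The switchable-policy idea admits a very small sketch. The abstract specifies the switch but not its schedule, so the linear annealing below and the `act` interfaces on both controllers are purely hypothetical:

```python
import random

def select_action(actor, classical_controller, obs, step, handover_steps=50_000):
    """Hedged sketch: anneal from a classical controller to the learned DRL actor."""
    p_classical = max(0.0, 1.0 - step / handover_steps)  # assumed linear schedule
    if random.random() < p_classical:
        return classical_controller.act(obs)  # e.g. PID: sensible primitives early on
    return actor.act(obs)                     # learned policy gradually takes over
```

Early experience thus comes from the simple control law rather than random exploration, and once `p_classical` reaches zero the classical controller is discarded entirely.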
Abstract:Due to the sparse rewards and high degree of environment variation, reinforcement learning approaches such as Deep Deterministic Policy Gradient (DDPG) are plagued by issues of high variance when applied in complex real-world environments. We present a new framework for overcoming these issues by incorporating a stochastic switch, allowing an agent to choose between high and low variance policies. The stochastic switch can be jointly trained with the original DDPG in the same framework. In this paper, we demonstrate the power of the framework in a navigation task, where the robot can dynamically choose to learn through exploration, or to use the output of a heuristic controller as guidance. Instead of starting from completely random moves, the navigation capability of a robot can be quickly bootstrapped by several simple independent controllers. The experimental results show that with the aid of stochastic guidance we are able to effectively and efficiently train DDPG navigation policies and achieve significantly better performance than state-of-the-art baseline models.
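Unlike a fixed schedule, the switch here is itself learned. One way to picture it is a small gating network that outputs the probability of deferring to the heuristic controller and samples the choice each step. This is a hedged sketch; the gate architecture and observation size are assumptions.

```python
import torch
import torch.nn as nn

class StochasticSwitch(nn.Module):
    """Hedged sketch: learned gate choosing between DDPG and a heuristic policy."""
    def __init__(self, obs_dim=32):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                  nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, obs, ddpg_action, heuristic_action):
        p = self.gate(obs)                     # P(use heuristic guidance)
        use_heuristic = torch.bernoulli(p)     # sampled binary switch
        action = use_heuristic * heuristic_action + (1 - use_heuristic) * ddpg_action
        return action, p   # p can be trained jointly, e.g. via a policy-gradient term
```

With several independent heuristic controllers, the gate would output a categorical distribution over them plus the DDPG policy instead of a single Bernoulli.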
Abstract:The ability to interact with and understand the environment is a fundamental prerequisite for a wide range of applications, from robotics to augmented reality. In particular, predicting how deformable objects will react to applied forces in real time is a significant challenge. This is further compounded by the fact that shape information about encountered objects in the real world is often impaired by occlusions, noise and missing regions; e.g. a robot manipulating an object will only be able to observe a partial view of the entire solid. In this work we present a framework, 3D-PhysNet, which is able to predict how a three-dimensional solid will deform under an applied force using intuitive physics modelling. In particular, we propose a new method to encode the physical properties of the material and the applied force, enabling generalisation over materials. The key is to combine deep variational autoencoders with adversarial training, conditioned on the applied force and the material properties. We further propose a cascaded architecture that takes a single 2.5D depth view of the object and predicts its deformation. Training data is provided by a physics simulator. The network is fast enough to be used in real-time applications from partial views. Experimental results show the viability and the generalisation properties of the proposed architecture.
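The conditioning mechanism can be pictured as a variational autoencoder whose encoder and decoder both receive the force and material parameters alongside the shape. The sketch below omits the adversarial branch, and all dimensions and the flattened-input layout are illustrative assumptions:

```python
import torch
import torch.nn as nn

class ConditionalVAE(nn.Module):
    """Hedged sketch: VAE conditioned on applied force and material properties."""
    def __init__(self, in_dim=4096, cond_dim=4, z_dim=64):
        super().__init__()
        self.enc = nn.Linear(in_dim + cond_dim, 2 * z_dim)   # mean and log-variance
        self.dec = nn.Linear(z_dim + cond_dim, in_dim)

    def forward(self, depth_view, cond):
        # depth_view: (B, in_dim) flattened 2.5D view; cond: force + material params.
        h = self.enc(torch.cat([depth_view, cond], dim=-1))
        mu, logvar = h.chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        return self.dec(torch.cat([z, cond], dim=-1)), mu, logvar
```

Feeding `cond` to both encoder and decoder is what lets a single trained model generalise across materials and force magnitudes, as the abstract claims.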