Autonomous UAV racing has recently emerged as a challenging research problem, with the ultimate goal of beating human pilots in this fast-paced sport. A common approach is to learn an end-to-end policy that directly predicts controls from raw images by imitating an expert. However, such a policy is limited by the expert it imitates, and scaling to other environments and vehicle dynamics is difficult. One way to overcome the drawbacks of an end-to-end policy is to train a network only on the perception task and to handle control with a PID or MPC controller. However, a single controller must be extensively tuned and usually cannot cover the whole state space. In this paper, we propose learning an optimized controller using a DNN that fuses multiple controllers. The network learns a robust controller with online trajectory filtering, which suppresses noisy trajectories and the imperfections of individual controllers. The result is a network that learns a good fusion of filtered trajectories from different controllers, leading to significant improvements in overall performance. We compare our trained network to the controllers it learned from, to end-to-end baselines, and to human pilots in a realistic simulation; our network beats all baselines in extensive experiments and approaches the performance of a professional human pilot. A video summarizing this work is available at https://youtu.be/hGKlE5X9Z5U
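To make the fusion idea concrete, here is a minimal sketch of how such a controller fusion setup could be structured. The abstract does not specify the architecture, so the layer sizes, the four-dimensional control output, and the filtering threshold below are all illustrative assumptions, not the paper's actual design.

```python
import torch
import torch.nn as nn

def filter_trajectories(trajectories, tracking_errors, max_error=0.5):
    """Stand-in for online trajectory filtering: discard trajectories whose
    mean tracking error suggests a noisy or poorly tuned controller."""
    return [t for t, e in zip(trajectories, tracking_errors) if e < max_error]

class ControllerFusionNet(nn.Module):
    """Hypothetical fusion network: stacks the control proposals of K
    individual (e.g. PID/MPC) controllers and predicts one fused control."""
    def __init__(self, num_controllers: int, control_dim: int = 4, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_controllers * control_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, control_dim),  # e.g. throttle, roll, pitch, yaw
        )

    def forward(self, proposals: torch.Tensor) -> torch.Tensor:
        # proposals: (batch, num_controllers, control_dim)
        return self.net(proposals.flatten(start_dim=1))
```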
In multimodal unsupervised image-to-image translation tasks, the goal is to translate an image from the source domain to many images in the target domain. We present a simple method that produces higher-quality images than the current state of the art while maintaining the same amount of multimodal diversity. Previous methods follow the unconditional approach of trying to map the latent code directly to a full-size image. This leads to complicated network architectures with several additional hyperparameters to tune. By treating the latent code as a modifier of the convolutional filters, we produce multimodal output while maintaining the traditional Generative Adversarial Network (GAN) loss and without additional hyperparameters. The only tuning required by our method controls the tradeoff between the variability and the quality of generated images. Furthermore, we achieve disentanglement between source-domain content and target-domain style for free as a by-product of our formulation. We perform qualitative and quantitative experiments showing the advantages of our method compared with the state of the art on multiple benchmark image-to-image translation datasets.
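The filter-modification idea can be illustrated with a short sketch: a per-sample affine projection of the latent code rescales the convolution filters, so each latent code yields a different effective filter bank while the network topology stays fixed. The abstract does not give the exact formulation; the per-channel scaling scheme and shapes below are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentModulatedConv2d(nn.Module):
    """Illustrative convolution whose filters are scaled by a learned affine
    projection of the latent code (one plausible reading of "latent code as
    a modifier of the convolutional filters"; details are assumed)."""
    def __init__(self, in_ch: int, out_ch: int, latent_dim: int, k: int = 3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
        self.affine = nn.Linear(latent_dim, in_ch)  # per-input-channel scales

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # x: (B, in_ch, H, W), z: (B, latent_dim)
        scale = self.affine(z) + 1.0                           # (B, in_ch)
        w = self.weight.unsqueeze(0) * scale[:, None, :, None, None]
        # Grouped-convolution trick: fold the batch into groups so each
        # sample is convolved with its own modulated filter bank.
        B, C, H, W = x.shape
        out = F.conv2d(x.reshape(1, B * C, H, W),
                       w.reshape(-1, C, *w.shape[-2:]),
                       padding=w.shape[-1] // 2, groups=B)
        return out.reshape(B, -1, H, W)
```

Because the latent code only rescales existing filters, this kind of layer drops into a standard generator without introducing extra loss terms, which is consistent with the hyperparameter-free claim above.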
Recent work has explored the problem of autonomous navigation by imitating a teacher and learning an end-to-end policy, which directly predicts controls from raw images. However, these approaches tend to be sensitive to mistakes by the teacher and do not scale well to other environments or vehicles. To address these limitations, we propose Observational Imitation Learning (OIL), a novel imitation learning variant that supports online training and automatic selection of optimal behavior by observing multiple imperfect teachers. We apply our proposed methodology to the challenging problems of autonomous driving and UAV racing. For both tasks, we utilize the Sim4CV simulator, which enables the generation of large amounts of synthetic training data and also allows for online learning and evaluation. We train a perception network to predict waypoints from raw image data and use OIL to train another network to predict controls from these waypoints. Extensive experiments demonstrate that our trained network outperforms its teachers, conventional imitation learning (IL) and reinforcement learning (RL) baselines, and even humans in simulation. The project website is available at https://sites.google.com/kaust.edu.sa/oil/
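As a rough illustration of the OIL idea, the snippet below scores each teacher's proposed action with a task-specific criterion (e.g. predicted progress along the track) and imitates only the best one. The actual selection and buffering scheme in OIL is more involved; `score_fn` here is a hypothetical placeholder.

```python
import torch
import torch.nn.functional as F

def oil_update(policy, optimizer, obs, teachers, score_fn):
    """One simplified OIL-style step: each imperfect teacher proposes an
    action for the current observation, the proposals are scored, and the
    policy imitates only the best-scoring behavior."""
    with torch.no_grad():
        proposals = torch.stack([teacher(obs) for teacher in teachers])  # (K, A)
        best_action = proposals[score_fn(obs, proposals).argmax()]
    optimizer.zero_grad()
    loss = F.mse_loss(policy(obs), best_action)  # imitate the selected teacher only
    loss.backward()
    optimizer.step()
    return loss.item()
```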
Automating the navigation of unmanned aerial vehicles (UAVs) in diverse scenarios has gained much attention in recent years. However, teaching UAVs to fly in challenging environments remains an unsolved problem, mainly due to the lack of training data. In this paper, we develop a photo-realistic simulator that enables the generation of large amounts of training data (both images rendered from the UAV camera and the corresponding controls) to teach a UAV to autonomously race through challenging tracks. We train a deep neural network to predict UAV controls from raw image data for the task of autonomous UAV racing. Training is done through imitation learning, with data augmentation used to allow for the correction of navigation mistakes. Extensive experiments demonstrate that our trained network (when sufficient data augmentation is used) outperforms state-of-the-art methods and flies more consistently than many human pilots.
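The abstract does not detail the augmentation, but one common recipe, and a plausible reading of "correction of navigation mistakes", is to render views from poses perturbed off the expert trajectory and bias the expert's command so it steers back toward the path. The gain and offset below are made up for illustration.

```python
def corrective_label(expert_yaw_rate: float, lateral_offset: float,
                     gain: float = 0.5) -> float:
    """Hypothetical label correction for an augmented view rendered
    `lateral_offset` meters off the expert flight path: bias the expert's
    yaw-rate command so the UAV steers back toward the path."""
    return expert_yaw_rate - gain * lateral_offset

# A view rendered 0.4 m to the right of the path gets a label nudged
# left of the expert's original command:
corrective_label(0.1, 0.4)  # -> -0.1
```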
We present a photo-realistic training and evaluation simulator (Sim4CV) with extensive applications across various fields of computer vision. Built on top of the Unreal Engine, the simulator integrates full-featured, physics-based cars, unmanned aerial vehicles (UAVs), and animated human actors in diverse urban and suburban 3D environments. We demonstrate the versatility of the simulator with two case studies: autonomous UAV-based tracking of moving objects and autonomous driving using supervised learning. The simulator integrates several state-of-the-art tracking algorithms with a benchmark evaluation tool, as well as a deep neural network (DNN) architecture for training vehicles to drive autonomously. It generates synthetic photo-realistic datasets with automatic ground-truth annotations to easily extend existing real-world datasets, and it provides extensive synthetic data variety through its ability to reconfigure synthetic worlds on the fly using an automatic world generation tool. The supplementary video can be viewed at https://youtu.be/SqAxzsQ7qUU
Recent work has tackled the problem of autonomous navigation by imitating a teacher and learning an end-to-end policy, which directly predicts controls from raw images. However, these approaches tend to be sensitive to mistakes by the teacher and do not scale well to other environments or vehicles. To address these limitations, we propose a modular network architecture that decouples perception from control and is trained using Observational Imitation Learning (OIL), a novel imitation learning variant that supports online training and automatic selection of optimal behavior by observing multiple teachers. We apply our proposed methodology to the challenging problem of unmanned aerial vehicle (UAV) racing. We develop a simulator that enables the generation of large amounts of synthetic training data (both UAV-captured images and the corresponding controls) and also allows for online learning and evaluation. We train a perception network to predict waypoints from raw image data and a control network to predict UAV controls from these waypoints using OIL. Our modular network is able to autonomously fly a UAV through challenging race tracks at high speeds. Extensive experiments demonstrate that our trained network outperforms its teachers, end-to-end baselines, and even human pilots in simulation. The supplementary video can be viewed at https://youtu.be/PeTXSoriflc
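A minimal sketch of this modular decomposition follows. The abstract only fixes the split into a perception network (image to waypoints) and a control network (waypoints to controls), so the waypoint parameterization, layer sizes, and four control outputs below are assumptions.

```python
import torch
import torch.nn as nn

class PerceptionNet(nn.Module):
    """Illustrative perception module: raw image -> K predicted waypoints
    (2-D image-plane coordinates; the exact parameterization is assumed)."""
    def __init__(self, num_waypoints: int = 5):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 2 * num_waypoints),
        )

    def forward(self, img: torch.Tensor) -> torch.Tensor:
        return self.backbone(img)  # (B, 2 * num_waypoints)

class ControlNet(nn.Module):
    """Illustrative control module (the part trained with OIL):
    waypoints -> 4 UAV commands, e.g. throttle/roll/pitch/yaw."""
    def __init__(self, num_waypoints: int = 5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * num_waypoints, 64), nn.ReLU(),
            nn.Linear(64, 4),
        )

    def forward(self, waypoints: torch.Tensor) -> torch.Tensor:
        return self.net(waypoints)

# The two modules compose into the full policy:
# controls = ControlNet()(PerceptionNet()(img))
```

One benefit of this split, consistent with the scalability argument above, is that the perception and control modules can be retrained or swapped independently, e.g. reusing the control network with a different camera or environment.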