Abstract: Traffic light recognition, a critical component of the perception module of self-driving vehicles, plays a vital role in intelligent transportation systems. Prevalent deep learning based traffic light recognition methods depend heavily on large and diverse training data. However, it is challenging to collect data in rare scenarios such as flashing, blackout, or extreme weather, which leads to an imbalanced distribution of training data and, consequently, degraded performance in recognizing rare classes. In this paper, we seek to improve traffic light recognition by leveraging data synthesis. Inspired by generative adversarial networks (GANs), we propose a novel traffic light generation approach, TL-GAN, which synthesizes data of rare classes to improve traffic light recognition for autonomous driving. TL-GAN disentangles traffic light sequence generation into image synthesis and sequence assembling. In the image synthesis stage, our approach uses conditional generation to allow full control over the color of the generated traffic light images. In the sequence assembling stage, we design style mixing and an adaptive template to synthesize realistic and diverse traffic light sequences. Extensive experiments show that the proposed TL-GAN yields a remarkable improvement over the baseline that does not use the generated data, and achieves state-of-the-art performance in comparison with competing algorithms for general image synthesis and for handling data imbalance.
Abstract: In deep learning-based local stereo matching methods, larger image patches usually bring better stereo matching accuracy. However, it is unrealistic to increase the patch size without restriction. Arbitrarily extending the patch size turns a local stereo matching method into a global one, and the matching accuracy saturates. We simplify the existing Siamese convolutional network by reducing the number of network parameters and propose an efficient CNN-based structure, namely the Adaptive Deconvolution-based disparity matching Net (ADSM net), which adds deconvolution layers that learn how to enlarge the input feature map for the subsequent convolution layers. Experimental results on the KITTI 2012 and 2015 datasets demonstrate that the proposed method achieves a good trade-off between accuracy and complexity.
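To illustrate how a deconvolution (transposed convolution) layer can enlarge a feature map before the convolution layers that follow it, the sketch below applies the standard output-size formulas for both layer types. The 9-pixel patch and the 3-pixel kernels are assumptions chosen for illustration only; the abstract does not state the actual ADSM net configuration, and the helper names are hypothetical.

```python
def deconv_output_size(size, kernel, stride=1, padding=0, output_padding=0):
    """Spatial size after a transposed convolution (dilation assumed to be 1)."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def conv_output_size(size, kernel, stride=1, padding=0):
    """Spatial size after an ordinary convolution (dilation assumed to be 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical example: a 9x9 input patch (not a value taken from the abstract).
patch = 9

# A 3x3 transposed convolution with stride 1 grows the map by kernel - 1 pixels.
enlarged = deconv_output_size(patch, kernel=3)

# An unpadded 3x3 convolution afterwards shrinks it back by the same amount.
restored = conv_output_size(enlarged, kernel=3)

print(patch, enlarged, restored)
```

Under these assumed sizes, the enlargement from the transposed convolution offsets the shrinkage of the following unpadded convolution, so a small input patch can still feed several convolution layers.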