Temporal action proposal generation, coming from temporal action recognition, is an important and challenging problem in computer vision. Because of the big capacity of video files, the speed of temporal action recognition is difficult for both researchers and companies. To training a convolutional neural network (CNN) for temporal action recognition, a lot of videos are required to put into the CNN. A speed-up for the task should be proposed for the training process to achieve the faster response of temporal action recognition system. To address it, we implement ring parallel architecture by Massage Passing Interface (MPI). Different from traditional parameter server architecture, total data transmission is reduced by adding a connection between multiple computing load in our new architecture. Compared to parameter server architecture, our parallel architecture has higher efficiency on temporal action proposal generation task with multiple GPUs, which is significant to dealing with large-scale video database. And based on the absence of evaluating time consumption in a distributed deep learning system, we proposed a concept of training time metrics which can assess the performance in the distributed training process.
Temporal action proposal generation is an important and challenging problem in computer vision. The biggest challenge for the task is generating proposals with precise temporal boundaries. To address these difficulties, we improved the algorithm based on boundary sensitive network. The popular temporal convolution network today overlooked the original meaning of the single video feature vector. We proposed a new temporal convolution network called Multipath Temporal ConvNet (MTN), which consists of two parts i.e. Multipath DenseNet and SE-ConvNet, can extract more useful information from the video database. Besides, to respond to the large memory occupation and a large number of videos, we abandon traditional parameter server parallel architecture and introduce high performance computing into temporal action proposal generation. To achieve this, we implement ring parallel architecture by Massage Passing Interface (MPI) acting on our method. Compared to parameter server architecture, our parallel architecture has higher efficiency on temporal action detection task with multiple GPUs, which is significant to dealing with large-scale video database. We conduct experiments on ActivityNet-1.3 and THUMOS14, where our method outperforms other state-of-art temporal action detection methods with high recall and high temporal precision.