Traditional action recognition methods are not specific to a particular operator, so their results are easily disturbed when other actions occur in the video. This paper proposes a network based on a mixed convolutional ResNet and a region proposal network (RPN). The rMC is tested on the UCF-101 data set and compared with R3D; its accuracy reaches 71.07%. The action recognition network is also tested on our gesture and body posture data sets for a specific target. In simulation it performs well, with a running speed of 200 FPS. Finally, the model is improved by introducing a regression block and performs better, which shows its potential.
Common target detection is usually based on single-frame images, which makes it vulnerable to interference from similar targets in the image and poorly suited to video. In this paper, an anchor mask is proposed to add prior knowledge to target detection, and an anchor mask net is designed to improve RPN performance for single-target detection. Tested on VOT2016, the model performs better.
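
The idea of using an anchor mask as prior knowledge can be illustrated with a minimal sketch. The abstract does not specify how the mask is built, so the rule below is an assumption made for illustration only: anchors whose centres fall near the target's previous location are kept, and the scores of all other anchors are suppressed. All function names, the `margin` parameter, and the toy scores are hypothetical and are not taken from the paper.

```python
# Hypothetical sketch of an anchor mask used as a location prior.
# The masking rule (keep anchors near the previous target box) is an
# assumption for illustration, not the method described in the paper.

def anchor_centers(feat_w, feat_h, stride):
    """Image-space (cx, cy) of the anchor centre for every feature-map cell."""
    return [[(x * stride + stride / 2, y * stride + stride / 2)
             for x in range(feat_w)]
            for y in range(feat_h)]

def build_anchor_mask(centers, prior_box, margin):
    """1.0 where the anchor centre lies inside prior_box grown by margin, else 0.0."""
    x1, y1, x2, y2 = prior_box
    x1, y1, x2, y2 = x1 - margin, y1 - margin, x2 + margin, y2 + margin
    return [[1.0 if (x1 <= cx <= x2 and y1 <= cy <= y2) else 0.0
             for (cx, cy) in row]
            for row in centers]

def apply_mask(scores, mask):
    """Element-wise product of objectness scores and the anchor mask."""
    return [[s * m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(scores, mask)]

def best_anchor(scores):
    """(row, col) of the highest-scoring anchor."""
    cells = [(r, c) for r in range(len(scores)) for c in range(len(scores[0]))]
    return max(cells, key=lambda rc: scores[rc[0]][rc[1]])

# Toy 4x4 score map: a similar-looking distractor at (3, 3) outscores
# the true target at (1, 1) when no prior is used.
scores = [[0.1, 0.1, 0.1, 0.1],
          [0.1, 0.7, 0.1, 0.1],
          [0.1, 0.1, 0.1, 0.1],
          [0.1, 0.1, 0.1, 0.9]]

centers = anchor_centers(feat_w=4, feat_h=4, stride=16)
mask = build_anchor_mask(centers, prior_box=(16, 16, 32, 32), margin=4)
masked = apply_mask(scores, mask)

print(best_anchor(scores))  # distractor wins without the prior
print(best_anchor(masked))  # target wins once the mask is applied
```

In this toy case the mask removes the distractor because it lies far from the previous target location, which is the kind of interference from similar targets that the abstract mentions. A learned anchor mask net would presumably predict the mask rather than compute it from a fixed rule.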