3D shape recognition has attracted more and more attention as a task of 3D vision research. The proliferation of 3D data encourages various deep learning methods based on 3D data. Now there have been many deep learning models based on point-cloud data or multi-view data alone. However, in the era of big data, integrating data of two different modals to obtain a unified 3D shape descriptor is bound to improve the recognition accuracy. Therefore, this paper proposes a fusion network based on multimodal attention mechanism for 3D shape recognition. Considering the limitations of multi-view data, we introduce a soft attention scheme, which can use the global point-cloud features to filter the multi-view features, and then realize the effective fusion of the two features. More specifically, we obtain the enhanced multi-view features by mining the contribution of each multi-view image to the overall shape recognition, and then fuse the point-cloud features and the enhanced multi-view features to obtain a more discriminative 3D shape descriptor. We have performed relevant experiments on the ModelNet40 dataset, and experimental results verify the effectiveness of our method.
This paper proposes a new framework to solve the problem of monocular visual odometry, called MagicVO . Based on Convolutional Neural Network (CNN) and Bi-directional LSTM (Bi-LSTM), MagicVO outputs a 6-DoF absolute-scale pose at each position of the camera with a sequence of continuous monocular images as input. It not only utilizes the outstanding performance of CNN in image feature processing to extract the rich features of image frames fully but also learns the geometric relationship from image sequences pre and post through Bi-LSTM to get a more accurate prediction. A pipeline of the MagicVO is shown in Fig. 1. The MagicVO system is end-to-end, and the results of experiments on the KITTI dataset and the ETH-asl cla dataset show that MagicVO has a better performance than traditional visual odometry (VO) systems in the accuracy of pose and the generalization ability.