Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

3D Object Detection for Autonomous Driving: A Survey

Jun 21, 2021
Rui Qian, Xin Lai, Xirong Li

Autonomous driving is regarded as one of the most promising remedies to shield human beings from severe crashes. To this end, 3D object detection serves as the core basis of such perception system especially for the sake of path planning, motion prediction, collision avoidance, etc. Generally, stereo or monocular images with corresponding 3D point clouds are already standard layout for 3D object detection, out of which point clouds are increasingly prevalent with accurate depth information being provided. Despite existing efforts, 3D object detection on point clouds is still in its infancy due to high sparseness and irregularity of point clouds by nature, misalignment view between camera view and LiDAR bird's eye of view for modality synergies, occlusions and scale variations at long distances, etc. Recently, profound progress has been made in 3D object detection, with a large body of literature being investigated to address this vision task. As such, we present a comprehensive review of the latest progress in this field covering all the main topics including sensors, fundamentals, and the recent state-of-the-art detection methods with their pros and cons. Furthermore, we introduce metrics and provide quantitative comparisons on popular public datasets. The avenues for future work are going to be judiciously identified after an in-deep analysis of the surveyed works. Finally, we conclude this paper.

* 3D object detection, Autonomous driving, Point clouds 

  Access Paper or Ask Questions

PP-Rec: News Recommendation with Personalized User Interest and Time-aware News Popularity

Jun 10, 2021
Tao Qi, Fangzhao Wu, Chuhan Wu, Yongfeng Huang

Personalized news recommendation methods are widely used in online news services. These methods usually recommend news based on the matching between news content and user interest inferred from historical behaviors. However, these methods usually have difficulties in making accurate recommendations to cold-start users, and tend to recommend similar news with those users have read. In general, popular news usually contain important information and can attract users with different interests. Besides, they are usually diverse in content and topic. Thus, in this paper we propose to incorporate news popularity information to alleviate the cold-start and diversity problems for personalized news recommendation. In our method, the ranking score for recommending a candidate news to a target user is the combination of a personalized matching score and a news popularity score. The former is used to capture the personalized user interest in news. The latter is used to measure time-aware popularity of candidate news, which is predicted based on news content, recency, and real-time CTR using a unified framework. Besides, we propose a popularity-aware user encoder to eliminate the popularity bias in user behaviors for accurate interest modeling. Experiments on two real-world datasets show our method can effectively improve the accuracy and diversity for news recommendation.

* ACL 2021 

  Access Paper or Ask Questions

Transformer Query-Target Knowledge Discovery (TEND): Drug Discovery from CORD-19

Dec 11, 2020
Leo K. Tam, Xiaosong Wang, Daguang Xu

Previous work established skip-gram word2vec models could be used to mine knowledge in the materials science literature for the discovery of thermoelectrics. Recent transformer architectures have shown great progress in language modeling and associated fine-tuned tasks, but they have yet to be adapted for drug discovery. We present a RoBERTa transformer-based method that extends the masked language token prediction using query-target conditioning to treat the specificity challenge. The transformer discovery method entails several benefits over the word2vec method including domain-specific (antiviral) analogy performance, negation handling, and flexible query analysis (specific) and is demonstrated on influenza drug discovery. To stimulate COVID-19 research, we release an influenza clinical trials and antiviral analogies dataset used in conjunction with the COVID-19 Open Research Dataset Challenge (CORD-19) literature dataset in the study. We examine k-shot fine-tuning to improve the downstream analogies performance as well as to mine analogies for model explainability. Further, the query-target analysis is verified in a forward chaining analysis against the influenza drug clinical trials dataset, before adapted for COVID-19 drugs (combinations and side-effects) and on-going clinical trials. In consideration of the present topic, we release the model, dataset, and code.

  Access Paper or Ask Questions

Using machine learning to correct model error in data assimilation and forecast applications

Oct 23, 2020
Alban Farchi, Patrick Laloyaux, Massimo Bonavita, Marc Bocquet

The idea of using machine learning (ML) methods to reconstruct the dynamics of a system is the topic of recent studies in the geosciences, in which the key output is a surrogate model meant to emulate the dynamical model. In order to treat sparse and noisy observations in a rigorous way, ML can be combined to data assimilation (DA). This yields a class of iterative methods in which, at each iteration a DA step assimilates the observations, and alternates with a ML step to learn the underlying dynamics of the DA analysis. In this article, we propose to use this method to correct the error of an existent, knowledge-based model. In practice, the resulting surrogate model is an hybrid model between the original (knowledge-based) model and the ML model. We demonstrate numerically the feasibility of the method using a two-layer, two-dimensional quasi-geostrophic channel model. Model error is introduced by the means of perturbed parameters. The DA step is performed using the strong-constraint 4D-Var algorithm, while the ML step is performed using deep learning tools. The ML models are able to learn a substantial part of the model error and the resulting hybrid surrogate models produce better short- to mid-range forecasts. Furthermore, using the hybrid surrogate models for DA yields a significantly better analysis than using the original model.

  Access Paper or Ask Questions

Pay Attention to Evolution: Time Series Forecasting with Deep Graph-Evolution Learning

Aug 28, 2020
Gabriel Spadon, Shenda Hong, Bruno Brandoli, Stan Matwin, Jose F. Rodrigues-Jr, Jimeng Sun

Time-series forecasting is one of the most active research topics in predictive analysis. A still open gap in that literature is that statistical and ensemble learning approaches systematically present lower predictive performance than deep learning methods as they generally disregard the data sequence aspect entangled with multivariate data represented in more than one time series. Conversely, this work presents a novel neural network architecture for time-series forecasting that combines the power of graph evolution with deep recurrent learning on distinct data distributions; we named our method Recurrent Graph Evolution Neural Network (ReGENN). The idea is to infer multiple multivariate relationships between co-occurring time-series by assuming that the temporal data depends not only on inner variables and intra-temporal relationships (i.e., observations from itself) but also on outer variables and inter-temporal relationships (i.e., observations from other-selves). An extensive set of experiments was conducted comparing ReGENN with dozens of ensemble methods and classical statistical ones, showing sound improvement of up to 64.87% over the competing algorithms. Furthermore, we present an analysis of the intermediate weights arising from ReGENN, showing that by looking at inter and intra-temporal relationships simultaneously, time-series forecasting is majorly improved if paying attention to how multiple multivariate data synchronously evolve.

* Submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence 

  Access Paper or Ask Questions

Decomposition-Based Multi-Objective Evolutionary Algorithm Design under Two Algorithm Frameworks

Aug 17, 2020
Lie Meng Pang, Hisao Ishibuchi, Ke Shang

The development of efficient and effective evolutionary multi-objective optimization (EMO) algorithms has been an active research topic in the evolutionary computation community. Over the years, many EMO algorithms have been proposed. The existing EMO algorithms are mainly developed based on the final population framework. In the final population framework, the final population of an EMO algorithm is presented to the decision maker. Thus, it is required that the final population produced by an EMO algorithm is a good solution set. Recently, the use of solution selection framework was suggested for the design of EMO algorithms. This framework has an unbounded external archive to store all the examined solutions. A pre-specified number of solutions are selected from the archive as the final solutions presented to the decision maker. When the solution selection framework is used, EMO algorithms can be designed in a more flexible manner since the final population is not necessarily to be a good solution set. In this paper, we examine the design of MOEA/D under these two frameworks. We use an offline genetic algorithm-based hyper-heuristic method to find the optimal configuration of MOEA/D in each framework. The DTLZ and WFG test suites and their minus versions are used in our experiments. The experimental results suggest the possibility that a more flexible, robust and high-performance MOEA/D algorithm can be obtained when the solution selection framework is used.

  Access Paper or Ask Questions

Self-supervised Depth Estimation to Regularise Semantic Segmentation in Knee Arthroscopy

Jul 05, 2020
Fengbei Liu, Yaqub Jonmohamadi, Gabriel Maicas, Ajay K. Pandey, Gustavo Carneiro

Intra-operative automatic semantic segmentation of knee joint structures can assist surgeons during knee arthroscopy in terms of situational awareness. However, due to poor imaging conditions (e.g., low texture, overexposure, etc.), automatic semantic segmentation is a challenging scenario, which justifies the scarce literature on this topic. In this paper, we propose a novel self-supervised monocular depth estimation to regularise the training of the semantic segmentation in knee arthroscopy. To further regularise the depth estimation, we propose the use of clean training images captured by the stereo arthroscope of routine objects (presenting none of the poor imaging conditions and with rich texture information) to pre-train the model. We fine-tune such model to produce both the semantic segmentation and self-supervised monocular depth using stereo arthroscopic images taken from inside the knee. Using a data set containing 3868 arthroscopic images captured during cadaveric knee arthroscopy with semantic segmentation annotations, 2000 stereo image pairs of cadaveric knee arthroscopy, and 2150 stereo image pairs of routine objects, we show that our semantic segmentation regularised by self-supervised depth estimation produces a more accurate segmentation than a state-of-the-art semantic segmentation approach modeled exclusively with semantic segmentation annotation.

* 10 pages, 6 figures 

  Access Paper or Ask Questions

DFraud3- Multi-Component Fraud Detection freeof Cold-start

Jun 11, 2020
Saeedreza Shehnepoor, Roberto Togneri, Wei Liu, Mohammed Bennamoun

Fraud review detection is a hot research topic inrecent years. The Cold-start is a particularly new but significant problem referring to the failure of a detection system to recognize the authenticity of a new user. State-of-the-art solutions employ a translational knowledge graph embedding approach (TransE) to model the interaction of the components of a review system. However, these approaches suffer from the limitation of TransEin handling N-1 relations and the narrow scope of a single classification task, i.e., detecting fraudsters only. In this paper, we model a review system as a Heterogeneous InformationNetwork (HIN) which enables a unique representation to every component and performs graph inductive learning on the review data through aggregating features of nearby nodes. HIN with graph induction helps to address the camouflage issue (fraudsterswith genuine reviews) which has shown to be more severe when it is coupled with cold-start, i.e., new fraudsters with genuine first reviews. In this research, instead of focusing only on one component, detecting either fraud reviews or fraud users (fraudsters), vector representations are learnt for each component, enabling multi-component classification. In other words, we are able to detect fraud reviews, fraudsters, and fraud-targeted items, thus the name of our approach DFraud3. DFraud3 demonstrates a significant accuracy increase of 13% over the state of the art on Yelp.

  Access Paper or Ask Questions

Learning the Compositional Visual Coherence for Complementary Recommendations

Jun 08, 2020
Zhi Li, Bo Wu, Qi Liu, Likang Wu, Hongke Zhao, Tao Mei

Complementary recommendations, which aim at providing users product suggestions that are supplementary and compatible with their obtained items, have become a hot topic in both academia and industry in recent years. %However, it is challenging due to its complexity and subjectivity. Existing work mainly focused on modeling the co-purchased relations between two items, but the compositional associations of item collections are largely unexplored. Actually, when a user chooses the complementary items for the purchased products, it is intuitive that she will consider the visual semantic coherence (such as color collocations, texture compatibilities) in addition to global impressions. Towards this end, in this paper, we propose a novel Content Attentive Neural Network (CANN) to model the comprehensive compositional coherence on both global contents and semantic contents. Specifically, we first propose a \textit{Global Coherence Learning} (GCL) module based on multi-heads attention to model the global compositional coherence. Then, we generate the semantic-focal representations from different semantic regions and design a \textit{Focal Coherence Learning} (FCL) module to learn the focal compositional coherence from different semantic-focal representations. Finally, we optimize the CANN in a novel compositional optimization strategy. Extensive experiments on the large-scale real-world data clearly demonstrate the effectiveness of CANN compared with several state-of-the-art methods.

* Early version accepted by IJCAI2020 

  Access Paper or Ask Questions