Unlike 2D object detection, where all RoI features come from grid pixels, RoI feature extraction in 3D point cloud object detection is more diverse. In this paper, we first compare and analyze the differences in structure and performance between two state-of-the-art models, PV-RCNN and Voxel-RCNN. We find that the performance gap between the two models comes not from point information but from structural information. Voxel features contain more structural information because they are obtained by quantizing rather than downsampling the point cloud, so they retain essentially the complete information of the whole point cloud. In our experiments, this stronger structural information gives the detector higher performance even though voxel features lack accurate location information. We therefore propose that structural information is the key to 3D object detection. Based on this conclusion, we propose a Self-Attention RoI Feature Extractor (SARFE) to enhance the structural information of the features extracted from 3D proposals. SARFE is a plug-and-play module that can be easily added to existing 3D detectors. We evaluate SARFE on both the KITTI dataset and the Waymo Open dataset. With SARFE, we improve the performance of state-of-the-art 3D detectors by a large margin on the cyclist class of the KITTI dataset while retaining real-time capability.
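The self-attention idea named in the abstract above can be illustrated with plain scaled dot-product attention over the feature vectors of the grid points inside one RoI. This is only a minimal sketch and not the SARFE module itself: it assumes identity query/key/value projections (no learned weights), and the function names and toy feature values are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Scaled dot-product self-attention with identity projections.

    features: list of N vectors (one per RoI grid point), each of length d.
    Returns (outputs, weights), where weights[i][j] is how strongly
    point i attends to point j and outputs[i] is the weighted sum of all
    point features for point i.
    """
    d = len(features[0])
    scale = math.sqrt(d)
    weights = []
    for query in features:
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in features]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * value[j] for w, value in zip(row, features)) for j in range(d)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical features for three grid points of a single RoI.
roi_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attention = self_attention(roi_features)
```

Each output vector mixes information from every grid point in the RoI, which is one way a feature can come to reflect the structure of the whole proposal rather than a single location.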
We introduce MemSum (Multi-step Episodic Markov decision process extractive SUMmarizer), a reinforcement-learning-based extractive summarizer enriched at any given time step with information on the current extraction history. Similar to previous models in this vein, MemSum iteratively selects sentences into the summary. Our innovation is in considering a broader information set when summarizing that would intuitively also be used by humans in this task: 1) the text content of the sentence, 2) the global text context of the rest of the document, and 3) the extraction history consisting of the set of sentences that have already been extracted. With a lightweight architecture, MemSum nonetheless obtains state-of-the-art test-set performance (ROUGE score) on long document datasets (PubMed, arXiv, and GovReport). Supporting analysis demonstrates that the added awareness of extraction history gives MemSum robustness against redundancy in the source document.
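The role of extraction history described above can be sketched with a simple iterative selection loop. This is not MemSum's learned reinforcement-learning policy: the sketch swaps in a hand-written score (word overlap with the document minus overlap with already extracted sentences), and the function names, the `redundancy_weight` parameter, and the toy sentences are all hypothetical.

```python
def tokens(sentence):
    """Lowercased bag of words for a sentence."""
    return set(sentence.lower().split())


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def extract(sentences, max_sentences=2, redundancy_weight=1.0):
    """Iteratively pick sentences, penalising overlap with the history.

    Each step scores a candidate using (1) its own content, (2) the whole
    document as global context, and (3) the sentences extracted so far.
    """
    doc_tokens = set().union(*(tokens(s) for s in sentences))
    history = []                      # indices of extracted sentences
    remaining = list(range(len(sentences)))

    def score(i):
        cand = tokens(sentences[i])
        relevance = jaccard(cand, doc_tokens)
        redundancy = max(
            (jaccard(cand, tokens(sentences[j])) for j in history),
            default=0.0,
        )
        return relevance - redundancy_weight * redundancy

    while remaining and len(history) < max_sentences:
        best = max(remaining, key=score)
        history.append(best)
        remaining.remove(best)
    return [sentences[i] for i in history]


document = [
    "memsum selects sentences using extraction history",
    "memsum selects sentences using extraction history",
    "results improve on pubmed",
]
summary = extract(document)
```

With `redundancy_weight=0.0` the loop ignores the history and picks the duplicated sentence twice; with the default weight the duplicate is skipped in favour of the third sentence.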
State-of-the-art methods for relation extraction consider the sentential context by modeling the entire sentence. However, syntactic indicators, i.e., certain phrases or words such as prepositions, are more informative than other words and may be beneficial for identifying semantic relations. Other approaches that use fixed text triggers capture such information but ignore lexical diversity. To leverage both syntactic indicators and sentential contexts, we propose an indicator-aware approach for relation extraction. First, we extract syntactic indicators under the guidance of syntactic knowledge. Then we construct a neural network that incorporates both the syntactic indicators and the entire sentences into better relation representations. In this way, the proposed model alleviates the impact of noisy information from entire sentences and overcomes the limitations of text triggers. Experiments on the SemEval-2010 Task 8 benchmark dataset show that our model significantly outperforms state-of-the-art methods.
Neural relation extraction models have shown promising results in recent years; however, model performance drops dramatically when only a few training samples are available. Recent works try to leverage advances in few-shot learning to solve this low-resource problem by training label-agnostic models that directly compare the semantic similarities among context sentences in the embedding space. However, label-aware information, i.e., the relation label that contains the semantic knowledge of the relation itself, is often neglected for prediction. In this work, we propose a framework that considers both label-agnostic and label-aware semantic mapping information for low-resource relation extraction. We show that incorporating these two types of mapping information in both pretraining and fine-tuning can significantly improve model performance on low-resource relation extraction tasks.
In the web era, as technology has revolutionized human life, large amounts of data and information are published on the Internet each day. For instance, news agencies all over the world publish news on their websites. These raw data could be an important resource for knowledge extraction. They contain emotions (i.e., positive, neutral, or negative) toward various topics; therefore, sentiment extraction could be beneficial in many respects. The sentiment of news provides highly valuable information about events over a period of time and about the viewpoint of a media outlet or news agency on these events. In this paper, we propose an approach for analyzing news and extracting useful knowledge from it. First, we attempt to extract a noise-robust sentiment from news documents; to this end, we download news associated with six countries (United States, United Kingdom, Germany, Canada, France, and Australia) in five news categories (Politics, Sports, Business, Entertainment, and Technology). We then compare the countries in each of the five news categories based on the sentiments and emotional content extracted from the news documents. Moreover, we propose an approach that reduces the bulky news data in order to extract the hottest topics and news titles as knowledge. Finally, we use Word2Vec to build a word model that maps each word to a fixed-size vector in order to understand the relations between words in our collected news database.
Two-Way Ranging enables distance estimation between two active parties and allows time-of-flight measurements despite relative clock offset and drift. Because Two-Way Ranging is limited by the number of messages it requires, scalable solutions build on Time Difference on Arrival to infer timing information at passive listeners. However, the demand for accurate distance estimates dictates a tight bound on time synchronization, thus limiting scalability to the localization of passive tags relative to static, synchronized anchors. This work describes the extraction of Time Difference on Arrival information from a Two-Way Ranging process, enabling the extraction of distance information at passive listeners and further allowing scalable tag localization without the need for static or synchronized anchors. The expected error is formally derived. The extension allows the extraction of the timing difference despite relative clock offset and drift for Double-Sided Two-Way Ranging and for Single-Sided Two-Way Ranging with additional carrier frequency offset estimation.
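The two ranging schemes named above can be illustrated with the standard textbook time-of-flight estimators. The sketch below does not implement the paper's Time Difference on Arrival extraction; it only shows, under an assumed clock model in which each device measures durations scaled by (1 + drift), why Double-Sided Two-Way Ranging tolerates clock drift better than uncorrected Single-Sided Two-Way Ranging. The function names, reply delays, and drift values are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def ss_twr_tof(t_round, t_reply):
    """Single-sided TWR: half of (round-trip time minus reply delay)."""
    return (t_round - t_reply) / 2.0


def ds_twr_tof(t_round1, t_reply1, t_round2, t_reply2):
    """Asymmetric double-sided TWR time-of-flight estimator."""
    return (t_round1 * t_round2 - t_reply1 * t_reply2) / (
        t_round1 + t_round2 + t_reply1 + t_reply2
    )


def simulate(distance_m, reply_a, reply_b, drift_a, drift_b):
    """Return (ss_distance, ds_distance) estimated with drifting clocks.

    Device A initiates; reply_a and reply_b are the true reply delays of
    A and B in seconds; drift_a and drift_b are relative clock errors
    (e.g. 10e-6 for +10 ppm).
    """
    tof = distance_m / SPEED_OF_LIGHT
    # Durations as seen by each device's own (drifting) clock.
    t_round1 = (2 * tof + reply_b) * (1 + drift_a)  # measured by A
    t_reply1 = reply_b * (1 + drift_b)              # measured by B
    t_round2 = (2 * tof + reply_a) * (1 + drift_b)  # measured by B
    t_reply2 = reply_a * (1 + drift_a)              # measured by A
    ss = ss_twr_tof(t_round1, t_reply1) * SPEED_OF_LIGHT
    ds = ds_twr_tof(t_round1, t_reply1, t_round2, t_reply2) * SPEED_OF_LIGHT
    return ss, ds


# 10 m apart, 1 ms reply delays, clocks off by +10 ppm and -10 ppm.
ss_estimate, ds_estimate = simulate(10.0, 1e-3, 1e-3, 10e-6, -10e-6)
```

In this model the single-sided error grows with the reply delay times the drift difference (metres for millisecond delays), while the double-sided error scales only with the time of flight itself, which is why the single-sided variant is paired with an additional frequency offset estimate.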
Open Information Extraction (Open IE) systems aim to obtain relation tuples through highly scalable extraction that is portable across domains, by identifying a variety of relation phrases and their arguments in arbitrary sentences. The first generation of Open IE learns linear-chain models based on unlexicalized features such as Part-of-Speech (POS) or shallow tags to label the intermediate words between a pair of potential arguments and thereby identify extractable relations. Open IE is currently in its second generation, which uses deep linguistic analysis to extract instances of the most frequently observed relation types, such as Verb, Noun and Prep, Verb and Prep, and Infinitive. Second-generation systems expose simple yet principled ways in which verbs express relationships, such as verb phrase-based extraction or clause-based extraction, and they obtain significantly higher performance than first-generation systems. In this paper, we present an overview of the two Open IE generations, including their strengths, weaknesses, and application areas.
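Verb phrase-based extraction, as mentioned above, can be sketched with a toy pattern over POS-tagged tokens: a run of verbs optionally followed by prepositions forms the relation phrase, and the adjacent noun runs become its arguments. This is a deliberately simplified illustration and does not reproduce any particular Open IE system; the tag set (NOUN, VERB, PREP, DET), the function name, and the hand-tagged example sentence are assumptions.

```python
def extract_triple(tagged):
    """Return (arg1, relation, arg2) for the first verb-based match, or None.

    tagged: list of (word, tag) pairs using a small hypothetical tag set:
    NOUN, VERB, PREP, DET.
    """
    n = len(tagged)
    for start in range(n):
        if tagged[start][1] != "VERB":
            continue
        # Relation phrase: one or more verbs, then zero or more prepositions.
        end = start
        while end < n and tagged[end][1] == "VERB":
            end += 1
        while end < n and tagged[end][1] == "PREP":
            end += 1
        # Left argument: contiguous nouns directly before the relation phrase.
        left = start
        while left > 0 and tagged[left - 1][1] == "NOUN":
            left -= 1
        # Right argument: skip determiners, then take contiguous nouns.
        right = end
        while right < n and tagged[right][1] == "DET":
            right += 1
        arg2_start = right
        while right < n and tagged[right][1] == "NOUN":
            right += 1
        if left < start and arg2_start < right:
            def join(span):
                return " ".join(word for word, _ in span)
            return (join(tagged[left:start]),
                    join(tagged[start:end]),
                    join(tagged[arg2_start:right]))
    return None


sentence = [("Einstein", "NOUN"), ("was", "VERB"), ("born", "VERB"),
            ("in", "PREP"), ("Ulm", "NOUN")]
triple = extract_triple(sentence)  # ("Einstein", "was born in", "Ulm")
```

Because the pattern relies only on tags and not on specific words, it is unlexicalized in the same spirit as the features the abstract describes, although real systems add many more constraints.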
In this study, we propose a feature extraction framework based on contrastive learning with adaptive positive and negative samples (CL-FEFA) that is suitable for unsupervised, supervised, and semi-supervised single-view feature extraction. CL-FEFA adaptively constructs the positive and negative samples from the results of feature extraction, which makes them more appropriate and accurate. Thereafter, the discriminative features are re-extracted according to the InfoNCE loss based on the previously constructed positive and negative samples, which makes the intra-class samples more compact and the inter-class samples more dispersed. At the same time, using the potential structure information of subspace samples to dynamically construct positive and negative samples makes our framework more robust to noisy data. Furthermore, CL-FEFA considers the mutual information between positive samples, that is, similar samples in potential structures, which provides theoretical support for its advantages in feature extraction. Numerical experiments show that the proposed framework has a strong advantage over traditional feature extraction methods and contrastive learning methods.
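The InfoNCE loss referred to above has a compact standard form for one anchor with one positive and several negatives. The sketch below computes that generic loss with cosine similarity; it assumes the positive and negative samples are already given and does not implement CL-FEFA's adaptive sample construction. The function names and the temperature value are hypothetical choices.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: -log( exp(s_pos) / (exp(s_pos) + sum exp(s_neg)) ).

    Similarities are cosine similarities divided by the temperature.
    The log-sum-exp is computed in a numerically stable way.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_denominator - logits[0]


# Toy features: the positive lies close to the anchor, the negatives do not.
loss_good = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
# Swapping roles (a far "positive", a near "negative") gives a larger loss.
loss_bad = info_nce([1.0, 0.0], [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Minimising this quantity pulls the anchor toward its positive and pushes it away from the negatives, which matches the compact intra-class and dispersed inter-class behaviour described in the abstract.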
Most existing supervised and few-shot learning relation extraction methods rely on labeled training data. However, in real-world scenarios, there exist many relations for which no training data is available. We address this issue from the perspective of zero-shot learning (ZSL), which is similar to the way humans learn and recognize new concepts with no prior knowledge. We propose a zero-shot learning relation extraction (ZSLRE) framework, which focuses on recognizing novel relations that have no corresponding labeled data available for training. Our proposed ZSLRE model aims to recognize new relations based on prototypical networks that are modified to utilize side (auxiliary) information. The additional use of side information allows these modified prototypical networks to recognize novel relations in addition to recognizing previously known relations. We construct side information from labels and their synonyms, hypernyms of named entities, and keywords. We build an automatic hypernym extraction framework to obtain hypernyms of various named entities directly from the web. We demonstrate through extensive experiments on two public datasets (NYT and FewRel) that our proposed model significantly outperforms state-of-the-art methods on supervised learning, few-shot learning, and zero-shot learning tasks. Our experimental results also demonstrate the effectiveness and robustness of our proposed model in a combination scenario. Once accepted for publication, we will publish ZSLRE's source code and datasets to enable reproducibility and encourage further research.
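The prototypical-network idea mentioned above can be sketched as nearest-prototype classification: each relation is represented by one vector, and a query is assigned to the closest one. The sketch assumes, purely for illustration, that a seen relation's prototype is the mean of its labeled support embeddings while an unseen relation's prototype is a single side-information embedding; it is not the ZSLRE model, and the relation names, helper names, and toy vectors are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def squared_distance(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def build_prototypes(embeddings_by_relation):
    """Map each relation label to the mean of its available embeddings."""
    return {label: mean_vector(vectors)
            for label, vectors in embeddings_by_relation.items()}


def classify(query, prototypes):
    """Return the relation whose prototype is nearest to the query."""
    return min(prototypes,
               key=lambda label: squared_distance(query, prototypes[label]))


# Seen relation: prototype from labeled support sentences (toy embeddings).
# Unseen relation: prototype from one side-information embedding only.
prototypes = build_prototypes({
    "born_in": [[1.0, 0.0], [0.8, 0.2]],
    "founded_by": [[0.0, 1.0]],
})
predicted = classify([0.1, 0.9], prototypes)
```

Because the unseen relation needs only a side-information vector and no labeled sentences, the same nearest-prototype rule covers both known and novel relations in this toy setting.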