Natalia Loukachevitch

RuNNE-2022 Shared Task: Recognizing Nested Named Entities

May 23, 2022

Ekaterina Artemova, Maxim Zmeev, Natalia Loukachevitch, Igor Rozhkov, Tatiana Batura, Vladimir Ivanov, Elena Tutubalina

Abstract: The RuNNE Shared Task approaches the problem of nested named entity recognition. The annotation schema is designed so that an entity may partially overlap with, or even be nested inside, another entity. For example, the named entity "The Yermolova Theatre" of type "organization" contains another entity, "Yermolova", of type "person". We adopt the Russian NEREL dataset for the RuNNE Shared Task. NEREL comprises news texts written in Russian and collected from the Wikinews portal. The annotation schema includes 29 entity types. The nesting of named entities in NEREL reaches up to six levels. The RuNNE Shared Task explores two setups. (i) In the general setup, all entities occur with more or less the same frequency. (ii) In the few-shot setup, the majority of entity types occur often in the training set, but some entity types have lower frequency and are thus challenging to recognize. In the test set, the frequency of all entity types is even. This paper reports on the results of the RuNNE Shared Task. Overall, the shared task received 156 submissions from nine teams. Half of the submissions outperform a straightforward BERT-based baseline in both setups. This paper gives an overview of the shared task setup and discusses the submitted systems, drawing out meaningful insights into the problem of nested NER. Links to the evaluation platform and the shared task data are available in our GitHub repository: https://github.com/dialogue-evaluation/RuNNE.
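
As a rough illustration of what "nested" means here, the sketch below represents entities as labeled character spans and measures how deeply they nest. The `Span` class and both helper functions are hypothetical names chosen for this example, not part of NEREL or the RuNNE tooling, and the offsets refer to the English rendering of the example from the abstract.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A labeled entity mention (hypothetical structure for illustration)."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def contains(outer: Span, inner: Span) -> bool:
    """True if inner lies within outer and the two do not cover identical text."""
    same_extent = (outer.start, outer.end) == (inner.start, inner.end)
    return outer.start <= inner.start and inner.end <= outer.end and not same_extent


def nesting_depth(spans: list[Span]) -> int:
    """Length of the longest chain of spans nested inside one another."""
    # Longer spans come first, so every container precedes what it contains.
    ordered = sorted(spans, key=lambda s: s.end - s.start, reverse=True)
    depth: list[int] = []
    for i, span in enumerate(ordered):
        parents = [depth[j] for j in range(i) if contains(ordered[j], span)]
        depth.append(1 + max(parents, default=0))
    return max(depth, default=0)


text = "The Yermolova Theatre"
spans = [
    Span(0, 21, "organization"),  # "The Yermolova Theatre"
    Span(4, 13, "person"),        # "Yermolova"
]
print(text[4:13], nesting_depth(spans))
```

A flat NER scheme would have to pick one of the two spans; a nested scheme keeps both, and the depth computed this way is the quantity the abstract reports as reaching six in NEREL.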

* To appear in Dialogue 2022

Taxonomy Enrichment with Text and Graph Vector Representations

Jan 21, 2022

Irina Nikishina, Mikhail Tikhomirov, Varvara Logacheva, Yuriy Nazarov, Alexander Panchenko, Natalia Loukachevitch

Abstract: Knowledge graphs such as DBpedia, Freebase or Wikidata always contain a taxonomic backbone that allows the arrangement and structuring of various concepts in accordance with the hypo-hypernym ("class-subclass") relationship. With the rapid growth of lexical resources for specific domains, the problem of automatically extending existing knowledge bases with new words is becoming more and more widespread. In this paper, we address the problem of taxonomy enrichment, which aims at adding new words to an existing taxonomy. We present a new method that achieves high results on this task with little effort. It uses resources that exist for the majority of languages, making the method universal. We extend our method by incorporating deep representations of graph structures, such as node2vec, Poincaré embeddings, and GCN, that have recently demonstrated promising results on various NLP tasks. Furthermore, combining these representations with word embeddings allows us to beat the state of the art. We conduct a comprehensive study of the existing approaches to taxonomy enrichment based on word and graph vector representations and their fusion approaches. We also explore ways of using deep learning architectures to extend the taxonomic backbones of knowledge graphs. We create a number of datasets for taxonomy extension for English and Russian. We achieve state-of-the-art results across different datasets and provide an in-depth error analysis.

NEREL: A Russian Dataset with Nested Named Entities, Relations and Events

Sep 03, 2021

Natalia Loukachevitch, Ekaterina Artemova, Tatiana Batura, Pavel Braslavski, Ilia Denisov, Vladimir Ivanov, Suresh Manandhar, Alexander Pugachev, Elena Tutubalina

Abstract: In this paper, we present NEREL, a Russian dataset for named entity recognition and relation extraction. NEREL is significantly larger than existing Russian datasets: to date it contains 56K annotated named entities and 39K annotated relations. An important difference from previous datasets is the annotation of nested named entities, as well as relations within nested entities and at the discourse level. NEREL can facilitate the development of novel models that can extract relations between nested named entities, as well as relations on both sentence and document levels. NEREL also contains the annotation of events involving named entities and their roles in the events. The NEREL collection is available via https://github.com/nerel-ds/NEREL.

* accepted to RANLP

Transfer Learning for Improving Results on Russian Sentiment Datasets

Jul 06, 2021

Anton Golubev, Natalia Loukachevitch

Abstract: In this study, we test a transfer learning approach on Russian sentiment benchmark datasets using an additional training sample created with a distant supervision technique. We compare several variants of combining the additional data with the benchmark training samples. The best results were achieved using a three-step approach of sequential training on general, thematic and original training samples. For most datasets, the results improved by more than 3% over the current state-of-the-art methods. The BERT-NLI model, which treats the sentiment classification problem as a natural language inference task, reached the human level of sentiment analysis on one of the datasets.

* Dialogue 2021

Studying Taxonomy Enrichment on Diachronic WordNet Versions

Nov 23, 2020

Irina Nikishina, Alexander Panchenko, Varvara Logacheva, Natalia Loukachevitch

Abstract: Ontologies, taxonomies, and thesauri are used in many NLP tasks. However, most studies are focused on the creation of these lexical resources rather than the maintenance of the existing ones. Thus, we address the problem of taxonomy enrichment. We explore the possibilities of taxonomy extension in a resource-poor setting and present methods which are applicable to a large number of languages. We create novel English and Russian datasets for training and evaluating taxonomy enrichment models and describe a technique of creating such datasets for other languages.

Improving Results on Russian Sentiment Datasets

Jul 28, 2020

Anton Golubev, Natalia Loukachevitch

Abstract: In this study, we test standard neural network architectures (CNN, LSTM, BiLSTM) and the recently introduced BERT architectures on previous Russian sentiment evaluation datasets. We compare two variants of Russian BERT and show that, for all sentiment tasks in this study, the conversational variant of Russian BERT performs better. The best results were achieved by the BERT-NLI model, which treats sentiment classification as a natural language inference task. On one of the datasets, this model practically achieves the human level.
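
To make the NLI framing concrete, here is a minimal sketch of how a sentiment example might be recast as premise and hypothesis pairs, with the predicted label taken from the hypothesis that scores highest for entailment. The hypothesis templates, function names, and the keyword-based `toy_score` are all assumptions made for this illustration; the listing does not specify the paper's templates or model interface, and a real system would replace `toy_score` with a trained NLI model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical hypothesis templates, one per sentiment label.
HYPOTHESES: Dict[str, str] = {
    "positive": "The sentiment of this text is positive.",
    "negative": "The sentiment of this text is negative.",
    "neutral": "The sentiment of this text is neutral.",
}

# Toy cue words used only by the stand-in scorer below.
CUES: Dict[str, set] = {
    "positive": {"great", "good"},
    "negative": {"bad", "awful"},
}


def to_nli_pairs(text: str) -> List[Tuple[str, str, str]]:
    """Build one (label, premise, hypothesis) triple per candidate label."""
    return [(label, text, hyp) for label, hyp in HYPOTHESES.items()]


def classify(text: str, entailment_score: Callable[[str, str], float]) -> str:
    """Return the label whose hypothesis the scorer finds most entailed."""
    best = max(to_nli_pairs(text), key=lambda p: entailment_score(p[1], p[2]))
    return best[0]


def toy_score(premise: str, hypothesis: str) -> float:
    """Stand-in for a trained NLI model: counts cue words in the premise."""
    words = set(premise.lower().split())
    for label, cues in CUES.items():
        if label in hypothesis:
            return float(len(words & cues))
    return 0.5  # the neutral hypothesis gets a small constant score


print(classify("The service was great", toy_score))
print(classify("It arrived on Tuesday", toy_score))
```

The appeal of this formulation is that the label set lives in the hypotheses rather than in the output layer, so the same scorer can be reused across datasets with different label inventories.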

* 13 pages, 8 tables. Accepted to AINL-2020 conference (https://ainlconf.ru/)

Attention-Based Neural Networks for Sentiment Attitude Extraction using Distant Supervision

Jun 30, 2020

Nicolay Rusnachenko, Natalia Loukachevitch

Abstract: In the sentiment attitude extraction task, the aim is to identify "attitudes", that is, sentiment relations between entities mentioned in text. In this paper, we provide a study on attention-based context encoders in the sentiment attitude extraction task. For this task, we adapt attentive context encoders of two types: (1) feature-based; (2) self-based. In our study, we utilize the corpus of Russian analytical texts RuSentRel and the automatically constructed news collection RuAttitudes for enriching the training set. We consider the problem of attitude extraction as two-class (positive, negative) and three-class (positive, negative, neutral) classification tasks for whole documents. Our experiments with the RuSentRel corpus show that the three-class classification models that employ the RuAttitudes corpus for training yield a 10% increase in F1, and an extra 3% when model architectures include the attention mechanism. We also provide an analysis of attention weight distributions depending on the term type.

* The 10th International Conference on Web Intelligence, Mining and Semantics (WIMS 2020), June 30-July 3, 2020, Biarritz, France
* 10 pages, 9 figures. The preprint of an article published in the proceedings of the 10th International Conference on Web Intelligence, Mining and Semantics (WIMS 2020). The final authenticated publication is available online at https://doi.org/10.1145/3405962.3405985. arXiv admin note: substantial text overlap with arXiv:2006.11605

Studying Attention Models in Sentiment Attitude Extraction Task

Jun 20, 2020

Nicolay Rusnachenko, Natalia Loukachevitch

* Métais E., Meziane F., Horacek H., Cimiano P. (eds) Natural Language Processing and Information Systems. NLDB 2020. Lecture Notes in Computer Science, vol 12089. Springer, Cham
* This is a preprint of an article published in the Proceedings of the 25th International Conference on Natural Language and Information Systems. The final authenticated publication is available online at https://doi.org/10.1007/978-3-030-51310-8_15

Sentiment Frames for Attitude Extraction in Russian

Jun 19, 2020

Natalia Loukachevitch, Nicolay Rusnachenko

Abstract: Texts can convey several types of inter-related information concerning opinions and attitudes. Such information includes the author's attitude towards mentioned entities, attitudes of the entities towards each other, and positive and negative effects on the entities in the described situations. In this paper, we describe the lexicon RuSentiFrames for Russian, in which predicate words and expressions are collected and linked to so-called sentiment frames conveying several types of presupposed information on attitudes and effects. We apply the created frames to the task of extracting attitudes from a large news collection.

* Proceedings of the International Conference on Computational Linguistics and Intellectual Technologies "Dialogue-2020", 2020, pp.526-537
* 12 pages, 1 figure, 6 tables

RUSSE'2020: Findings of the First Taxonomy Enrichment Task for the Russian language

May 22, 2020

Irina Nikishina, Varvara Logacheva, Alexander Panchenko, Natalia Loukachevitch

Abstract: This paper describes the results of the first shared task on taxonomy enrichment for the Russian language. The participants were asked to extend an existing taxonomy with previously unseen words: for each new word, their systems should provide a ranked list of possible (candidate) hypernyms. In comparison to the previous tasks for other languages, our competition has a more realistic task setting: new words were provided without definitions. Instead, we provided a textual corpus in which these new terms occurred. For this evaluation campaign, we developed a new evaluation dataset based on unpublished RuWordNet data. The shared task features two tracks: "nouns" and "verbs". Sixteen teams participated in the task, demonstrating high results, with more than half of them outperforming the provided baseline.
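
The expected system output, a ranked list of candidate hypernyms for each new word, can be sketched as follows. This is only one simple way to produce such a list, by ranking taxonomy nodes on cosine similarity to the new word's vector; it is not claimed to be the shared task baseline or any participant's method. The function names and the two-dimensional toy vectors are invented for the example.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_hypernyms(
    word_vec: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    k: int = 10,
) -> List[str]:
    """Return up to k candidate names, most similar to word_vec first."""
    ranked = sorted(
        candidates,
        key=lambda name: cosine(word_vec, candidates[name]),
        reverse=True,
    )
    return ranked[:k]


# Made-up vectors standing in for embeddings of existing taxonomy nodes.
taxonomy_nodes = {
    "animal": [1.0, 0.0],
    "vehicle": [0.0, 1.0],
    "furniture": [-1.0, 0.0],
}
new_word_vec = [0.9, 0.1]  # made-up vector for an unseen word

print(rank_hypernyms(new_word_vec, taxonomy_nodes, k=2))
```

In the setting the abstract describes, the vector for a new word would have to come from the provided text corpus, since no definitions are available.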
