Ekaterina Artemova

Revisiting Mahalanobis Distance for Transformer-Based Out-of-Domain Detection

Jan 11, 2021

Alexander Podolskiy, Dmitry Lipin, Andrey Bout, Ekaterina Artemova, Irina Piontkovskaya

Abstract: Real-life applications that rely heavily on machine learning, such as dialog systems, demand out-of-domain detection methods. Intent classification models should be equipped with a mechanism to distinguish seen intents from unseen ones so that the dialog agent is capable of rejecting the latter and avoiding undesired behavior. However, despite increasing attention paid to the task, the best practices for out-of-domain intent detection have not yet been fully established. This paper conducts a thorough comparison of out-of-domain intent detection methods. We prioritize methods that do not require access to out-of-domain data during training, since gathering such data is extremely time- and labor-consuming due to the lexical and stylistic variation of user utterances. We evaluate multiple contextual encoders and methods proven to be efficient on three standard datasets for intent classification, expanded with out-of-domain utterances. Our main findings show that fine-tuning Transformer-based encoders on in-domain data leads to superior results. Mahalanobis distance, together with utterance representations derived from Transformer-based encoders, outperforms other methods by a wide margin and establishes new state-of-the-art results for all datasets. The broader analysis shows that the reason for success lies in the fact that the fine-tuned Transformer is capable of constructing homogeneous representations of in-domain utterances, revealing geometrical disparity to out-of-domain utterances. In turn, the Mahalanobis distance captures this disparity easily.

* to appear in AAAI 2021
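
The Mahalanobis-distance scoring named in the abstract above can be illustrated with a minimal sketch. It assumes one common formulation: a Gaussian per in-domain intent with a shared covariance matrix, scoring an utterance by its distance to the closest intent centroid. The abstract does not spell out these details, so the function names, the shared-covariance choice, and the random toy vectors (standing in for Transformer utterance representations) are all assumptions made for illustration.

```python
import numpy as np

def fit_class_gaussians(features, labels):
    """Estimate per-class means and the inverse of a shared covariance matrix."""
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Center every sample on the mean of its own class.
    centered = features - means[np.searchsorted(classes, labels)]
    cov = centered.T @ centered / len(features)
    # The pseudo-inverse guards against a singular covariance estimate.
    return means, np.linalg.pinv(cov)

def mahalanobis_ood_score(x, means, cov_inv):
    """Distance to the closest class centroid; larger suggests out-of-domain."""
    diffs = means - x  # shape: (n_classes, dim)
    d2 = np.einsum("cd,de,ce->c", diffs, cov_inv, diffs)
    return float(np.sqrt(max(d2.min(), 0.0)))

# Toy data (assumption): two in-domain "intents" as 2-D Gaussian clusters.
rng = np.random.default_rng(0)
features = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=1.0, size=(200, 2)),
    rng.normal(loc=[5.0, 5.0], scale=1.0, size=(200, 2)),
])
labels = np.array([0] * 200 + [1] * 200)

means, cov_inv = fit_class_gaussians(features, labels)
in_score = mahalanobis_ood_score(np.array([0.1, -0.2]), means, cov_inv)
ood_score = mahalanobis_ood_score(np.array([20.0, -20.0]), means, cov_inv)
print(in_score < ood_score)
```

In practice a rejection threshold on the score would be tuned on held-out in-domain data; that step is left out here.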

RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark

Nov 02, 2020

Tatiana Shavrina, Alena Fenogenova, Anton Emelyanov, Denis Shevelev, Ekaterina Artemova, Valentin Malykh, Vladislav Mikhailov, Maria Tikhonova, Andrey Chertok, Andrey Evlampiev

Abstract: In this paper, we introduce an advanced Russian general language understanding evaluation benchmark, RussianGLUE. Recent advances in the field of universal language models and transformers require the development of a methodology for their broad diagnostics and testing for general intellectual skills: detection of natural language inference, commonsense reasoning, and the ability to perform simple logical operations regardless of text subject or lexicon. For the first time, a benchmark of nine tasks, collected and organized analogously to the SuperGLUE methodology, was developed from scratch for the Russian language. We provide baselines, human-level evaluation, an open-source framework for evaluating models (https://github.com/RussianNLP/RussianSuperGLUE), and an overall leaderboard of transformer models for the Russian language. In addition, we present the first results of comparing multilingual models on the adapted diagnostic test set and offer the first steps toward further expanding or assessing state-of-the-art models independently of language.

* to appear in EMNLP 2020

RuREBus: a Case Study of Joint Named Entity Recognition and Relation Extraction from e-Government Domain

Oct 29, 2020

Vitaly Ivanin, Ekaterina Artemova, Tatiana Batura, Vladimir Ivanov, Veronika Sarkisyan, Elena Tutubalina, Ivan Smurov

Abstract: We showcase an application of information extraction methods, such as named entity recognition (NER) and relation extraction (RE), to a novel corpus consisting of documents issued by a state agency. The main challenges of this corpus are: 1) the annotation scheme differs greatly from the one used for general-domain corpora, and 2) the documents are written in a language other than English. Contrary to expectations, the state-of-the-art transformer-based models show modest performance on both tasks, whether approached sequentially or in an end-to-end fashion. Our experiments have demonstrated that fine-tuning on large unlabeled corpora does not automatically yield significant improvement, and thus we may conclude that more sophisticated strategies for leveraging unlabeled texts are needed. In this paper, we describe the whole developed pipeline, starting from text annotation and baseline development and ending with the design of a shared task in hopes of improving the baseline. Ultimately, we find that current NER and RE technologies are far from mature and cannot yet overcome challenges like ours.

* to appear in AIST 2020

DaNetQA: a yes/no Question Answering Dataset for the Russian Language

Oct 15, 2020

Taisia Glushkova, Alexey Machnev, Alena Fenogenova, Tatiana Shavrina, Ekaterina Artemova, Dmitry I. Ignatov

Abstract: DaNetQA, a new question-answering corpus, follows the design of Clark et al. (2019): it comprises natural yes/no questions. Each question is paired with a paragraph from Wikipedia and an answer derived from the paragraph. The task is to take both the question and the paragraph as input and come up with a yes/no answer, i.e. to produce a binary output. In this paper, we present a reproducible approach to DaNetQA creation and investigate transfer learning methods for task and language transfer. For task transfer we leverage three similar sentence modelling tasks: 1) a corpus of paraphrases, Paraphraser, 2) an NLI task, for which we use the Russian part of XNLI, and 3) another question answering task, SberQUAD. For language transfer we use English-to-Russian translation together with multilingual language fine-tuning.

* Analysis of Images, Social Networks and Texts - 9th International Conference, AIST 2020, Skolkovo, Russia, October 15-16, 2020, Revised Selected Papers. Lecture Notes in Computer Science (https://dblp.org/db/series/lncs/index.html), Springer 2020

ELMo and BERT in semantic change detection for Russian

Oct 07, 2020

Julia Rodina, Yuliya Trofimova, Andrey Kutuzov, Ekaterina Artemova

Abstract: We study the effectiveness of contextualized embeddings for the task of diachronic semantic change detection for Russian language data. Evaluation test sets consist of Russian nouns and adjectives annotated based on their occurrences in texts created in pre-Soviet, Soviet and post-Soviet time periods. ELMo and BERT architectures are compared on the task of ranking Russian words according to the degree of their semantic change over time. We use several methods for aggregation of contextualized embeddings from these architectures and evaluate their performance. Finally, we compare unsupervised and supervised techniques in this task.

* The 9th International Conference on Analysis of Images, Social Networks and Texts (AIST 2020)
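
To make the idea of aggregating contextualized embeddings concrete, here is a minimal sketch of one possible aggregation: average a word's token embeddings within each time period and take the cosine distance between the two averages as a change score. The abstract above does not name the specific aggregation methods evaluated, so this choice, the function name, and the random toy vectors (standing in for ELMo or BERT token embeddings) are assumptions for illustration only.

```python
import numpy as np

def prototype_cosine_distance(emb_a, emb_b):
    """Cosine distance between the mean embeddings of two time periods."""
    proto_a = emb_a.mean(axis=0)
    proto_b = emb_b.mean(axis=0)
    cos = proto_a @ proto_b / (np.linalg.norm(proto_a) * np.linalg.norm(proto_b))
    return 1.0 - float(cos)

# Toy data (assumption): 50 occurrences per period, 4-dimensional embeddings.
rng = np.random.default_rng(0)
old_sense = np.array([1.0, 0.0, 0.0, 0.0])
new_sense = np.array([0.0, 1.0, 0.0, 0.0])

# A "stable" word keeps the same usage; a "shifted" word moves to a new one.
stable_early = old_sense + 0.1 * rng.normal(size=(50, 4))
stable_late = old_sense + 0.1 * rng.normal(size=(50, 4))
shifted_early = old_sense + 0.1 * rng.normal(size=(50, 4))
shifted_late = new_sense + 0.1 * rng.normal(size=(50, 4))

stable_score = prototype_cosine_distance(stable_early, stable_late)
shifted_score = prototype_cosine_distance(shifted_early, shifted_late)
print(stable_score < shifted_score)
```

Ranking words by such a score, highest first, gives an unsupervised ordering by estimated degree of semantic change.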

So What's the Plan? Mining Strategic Planning Documents

Jul 07, 2020

Ekaterina Artemova, Tatiana Batura, Anna Golenkovskaya, Vitaly Ivanin, Vladimir Ivanov, Veronika Sarkisyan, Ivan Smurov, Elena Tutubalina

Abstract: In this paper we present a corpus of Russian strategic planning documents, RuREBus. This project is grounded in both language technology and e-government perspectives. Not only are new language sources and tools being developed, but so are their applications to e-government research. We demonstrate the pipeline for creating a text corpus from scratch. First, the annotation schema is designed. Next, texts are marked up using a human-in-the-loop strategy, so that preliminary annotations are derived from a machine learning model and are manually corrected. The amount of annotated text is large enough to showcase what insights can be gained from RuREBus.

* 15 pages, 3 figures, 5 tables. The paper has been accepted for the Fifth International Conference on Digital Transformation and Global Society (DTGS 2020)

NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing

Jun 12, 2020

Nikita Klyuchnikov, Ilya Trofimov, Ekaterina Artemova, Mikhail Salnikov, Maxim Fedorov, Evgeny Burnaev

Abstract: Neural Architecture Search (NAS) is a promising and rapidly evolving research area. Training a large number of neural networks requires an exceptional amount of computational power, which puts NAS out of reach for researchers who have limited or no access to high-performance clusters and supercomputers. A few benchmarks with precomputed neural architecture performance have been recently introduced to overcome this problem and ensure more reproducible experiments. However, these benchmarks cover only the computer vision domain and, thus, are built from image datasets and convolution-derived architectures. In this work, we step outside the computer vision domain by leveraging the language modeling task, which is the core of natural language processing (NLP). Our main contributions are as follows: we have provided a search space of recurrent neural networks on text datasets and trained 14k architectures within it; we have conducted both intrinsic and extrinsic evaluation of the trained models using datasets for semantic relatedness and language understanding evaluation; finally, we have tested several NAS algorithms to demonstrate how the precomputed results can be utilized. We believe that our results have high potential for use by both the NAS and NLP communities.

Data-driven models and computational tools for neurolinguistics: a language technology perspective

Mar 23, 2020

Ekaterina Artemova, Amir Bakarov, Aleksey Artemov, Evgeny Burnaev, Maxim Sharaev

Abstract: In this paper, we focus on the connection between language technologies and research in neurolinguistics, and on the influence of the former on the latter. We present a review of brain imaging-based neurolinguistic studies with a focus on natural language representations, such as word embeddings and pre-trained language models. Mutual enrichment of neurolinguistics and language technologies leads to the development of brain-aware natural language representations. The importance of this research area is emphasized by medical applications.

* Journal of Cognitive Science, 2020
* 37 pages, 1 figure

A Joint Approach to Compound Splitting and Idiomatic Compound Detection

Mar 21, 2020

Irina Krotova, Sergey Aksenov, Ekaterina Artemova

Abstract: Applications such as machine translation, speech recognition, and information retrieval require efficient handling of noun compounds, as they are one of the possible sources of out-of-vocabulary (OOV) words. In-depth processing of noun compounds requires not only splitting them into smaller components (or even roots) but also identifying instances that should remain unsplit because they are idiomatic in nature. We develop a two-fold deep learning-based approach to noun compound splitting and idiomatic compound detection for the German language, which we train using a newly collected corpus of annotated German compounds. Our neural noun compound splitter operates on a sub-word level and outperforms the current state of the art by about 5%.

* 8 pages, 5 tables, 1 figure, accepted at LREC 2020

Word Sense Disambiguation for 158 Languages using Word Embeddings Only

Mar 14, 2020

Varvara Logacheva, Denis Teslenko, Artem Shelmanov, Steffen Remus, Dmitry Ustalov, Andrey Kutuzov, Ekaterina Artemova, Chris Biemann, Simone Paolo Ponzetto, Alexander Panchenko

Abstract: Disambiguation of word senses in context is easy for humans, but is a major challenge for automatic approaches. Sophisticated supervised and knowledge-based models were developed to solve this task. However, (i) the inherent Zipfian distribution of supervised training instances for a given word and/or (ii) the quality of linguistic knowledge representations motivate the development of completely unsupervised and knowledge-free approaches to word sense disambiguation (WSD). They are particularly useful for under-resourced languages which do not have any resources for building either supervised or knowledge-based models. In this paper, we present a method that takes as input a standard pre-trained word embedding model and induces a fully-fledged word sense inventory, which can be used for disambiguation in context. We use this method to induce a collection of sense inventories for 158 languages on the basis of the original pre-trained fastText word embeddings by Grave et al. (2018), enabling WSD in these languages. Models and system are available online.

* 10 pages, 5 figures, 4 tables, accepted at LREC 2020
