



Abstract: We describe a new addition to the WebVectors toolkit for serving word embedding models over the Web. The new ELMoViz module adds support for contextualized embedding architectures, in particular for ELMo models. The provided visualizations follow the metaphor of 'two-dimensional text' by showing lexical substitutes: words that are most semantically similar in context to the words of the input sentence. The system allows the user to choose the ELMo layers from which token embeddings are inferred. It also conveys corpus information about the query words and their lexical substitutes (namely, their frequency tiers and parts of speech). The module is well integrated into the rest of the WebVectors toolkit, providing lexical hyperlinks to word representations in static embedding models. Two web services have already implemented the new functionality, using pre-trained ELMo models for Russian, Norwegian, and English.
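
As a rough illustration of the core operation behind such visualizations, the sketch below retrieves in-context lexical substitutes by ranking a vocabulary by cosine similarity to a contextualized token vector. This is not the toolkit's actual code: the function name, the random stand-in vectors, and the toy vocabulary are all hypothetical, and in practice the token vector would come from a chosen ELMo layer.

```python
import numpy as np

def lexical_substitutes(token_vec, vocab_vecs, vocab_words, top_k=5):
    """Return the top_k vocabulary words whose embeddings are most
    cosine-similar to a contextualized token vector."""
    # Normalize rows so that dot products equal cosine similarities.
    vocab_norm = vocab_vecs / np.linalg.norm(vocab_vecs, axis=1, keepdims=True)
    token_norm = token_vec / np.linalg.norm(token_vec)
    sims = vocab_norm @ token_norm
    best = np.argsort(-sims)[:top_k]
    return [(vocab_words[i], float(sims[i])) for i in best]

# Toy usage: random vectors stand in for real ELMo output
# (1024 is the dimensionality of standard ELMo layers).
rng = np.random.default_rng(0)
vocab_words = ["bank", "river", "money", "shore", "loan"]
vocab_vecs = rng.standard_normal((len(vocab_words), 1024))
token_vec = rng.standard_normal(1024)  # e.g. an ELMo vector for "bank" in context
print(lexical_substitutes(token_vec, vocab_vecs, vocab_words, top_k=3))
```

In a real deployment the vocabulary matrix would hold contextualized or averaged type embeddings, and the layer from which `token_vec` is taken (or an average over layers) is exactly the user-controlled setting the abstract describes.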




Abstract: We critically evaluate the widespread assumption that deep learning NLP models do not require lemmatized input. To test this, we trained versions of contextualized ELMo word embedding models on raw tokenized corpora and on the same corpora with word tokens replaced by their lemmas. These models were then evaluated on the word sense disambiguation task for English and Russian. The experiments showed that while lemmatization is indeed unnecessary for English, the situation is different for Russian: it appears that for morphologically rich languages, using lemmatized training and testing data yields small but consistent improvements, at least for word sense disambiguation. This means that decisions about text pre-processing before training ELMo should take into account the linguistic nature of the language in question.
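
The two training regimes differ only in a preprocessing step. The sketch below shows one plausible way to produce the raw-tokenized and lemmatized corpus variants; the abstract does not name the lemmatizer used, so spaCy and its `en_core_web_sm` model appear here purely for illustration (a Russian corpus would need a Russian-capable tagger/lemmatizer instead).

```python
import spacy

# Hypothetical preprocessing pipeline; requires the small English model:
#   python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def preprocess(line, lemmatize=False):
    """Tokenize one line of the corpus; optionally replace each token
    with its lemma to produce the lemmatized training variant."""
    doc = nlp(line)
    return " ".join(tok.lemma_ if lemmatize else tok.text for tok in doc)

sentence = "The banks were closing their branches."
print(preprocess(sentence, lemmatize=False))  # raw tokenized variant
print(preprocess(sentence, lemmatize=True))   # lemmatized variant
```

For English the two variants differ little ("banks" vs "bank", "were" vs "be"), whereas in Russian a single lemma can correspond to a dozen inflected forms, which is consistent with the abstract's finding that lemmatization matters more for morphologically rich languages.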