Eneko Agirre

Unsupervised Statistical Machine Translation

Sep 04, 2018
Mikel Artetxe, Gorka Labaka, Eneko Agirre

While modern machine translation has relied on large parallel corpora, a recent line of work has managed to train Neural Machine Translation (NMT) systems from monolingual corpora only (Artetxe et al., 2018c; Lample et al., 2018). Despite the potential of this approach for low-resource settings, existing systems are far behind their supervised counterparts, limiting their practical interest. In this paper, we propose an alternative approach based on phrase-based Statistical Machine Translation (SMT) that significantly closes the gap with supervised systems. Our method profits from the modular architecture of SMT: we first induce a phrase table from monolingual corpora through cross-lingual embedding mappings, combine it with an n-gram language model, and fine-tune hyperparameters through an unsupervised MERT variant. In addition, iterative backtranslation improves results further, yielding, for instance, 14.08 and 26.22 BLEU points in WMT 2014 English-German and English-French, respectively, an improvement of more than 7-10 BLEU points over previous unsupervised systems, and closing the gap with supervised SMT (Moses trained on Europarl) down to 2-5 BLEU points. Our implementation is available at https://github.com/artetxem/monoses

* EMNLP 2018
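
The phrase-table induction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that translation scores come from a softmax over cosine similarities in the shared embedding space; the abstract does not spell out the scoring function, so the `temperature` parameter, the function names and the toy vectors below are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def translation_probs(src_vec, tgt_vecs, temperature=0.1):
    """Softmax over cosine similarities to every candidate target phrase.

    `temperature` is an assumed knob: lower values give a sharper distribution.
    """
    scores = {t: cosine(src_vec, v) / temperature for t, v in tgt_vecs.items()}
    max_score = max(scores.values())  # subtract the max for numerical stability
    exps = {t: math.exp(s - max_score) for t, s in scores.items()}
    total = sum(exps.values())
    return {t: e / total for t, e in exps.items()}

# Toy 2-D embeddings, assumed to be already mapped to a shared space.
src = [0.9, 0.1]
tgt = {"maison": [1.0, 0.0], "chat": [0.0, 1.0], "grande maison": [0.7, 0.7]}
probs = translation_probs(src, tgt)
best = max(probs, key=probs.get)
```

Scores of this kind would fill the translation-probability columns of a phrase table, which the decoder then combines with the language model.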

A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings

May 17, 2018
Mikel Artetxe, Gorka Labaka, Eneko Agirre

Recent work has managed to learn cross-lingual word embeddings without parallel data by mapping monolingual embeddings to a shared space through adversarial training. However, their evaluation has focused on favorable conditions, using comparable corpora or closely-related languages, and we show that they often fail in more realistic scenarios. This work proposes an alternative approach based on a fully unsupervised initialization that explicitly exploits the structural similarity of the embeddings, and a robust self-learning algorithm that iteratively improves this solution. Our method succeeds in all tested scenarios and obtains the best published results in standard datasets, even surpassing previous supervised systems. Our implementation is released as an open source project at https://github.com/artetxem/vecmap

* ACL 2018
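
The self-learning idea in the abstract (learn a mapping from the current dictionary, re-induce the dictionary from the mapping, repeat) can be sketched on toy data. This is a simplification under stated assumptions, not the released implementation: the embeddings are 2-D unit vectors, the mapping is restricted to a rotation, and the loop starts from a hand-made noisy dictionary rather than the unsupervised initialization the paper proposes. All names are hypothetical.

```python
import math

def rotate(vec, theta):
    """Rotate a 2-D vector by theta radians."""
    x, y = vec
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta))

def fit_rotation(src, tgt, dictionary):
    """Rotation angle that best aligns each source word with its current translation."""
    cross = sum(src[s][0] * tgt[t][1] - src[s][1] * tgt[t][0]
                for s, t in dictionary.items())
    dot = sum(src[s][0] * tgt[t][0] + src[s][1] * tgt[t][1]
              for s, t in dictionary.items())
    return math.atan2(cross, dot)

def induce_dictionary(src, tgt, theta):
    """Nearest-neighbour retrieval (by dot product) after mapping the source vectors."""
    result = {}
    for s, vec in src.items():
        mx, my = rotate(vec, theta)
        result[s] = max(tgt, key=lambda t: mx * tgt[t][0] + my * tgt[t][1])
    return result

def self_learn(src, tgt, dictionary, iterations=5):
    """Alternate between fitting the mapping and re-inducing the dictionary."""
    theta = 0.0
    for _ in range(iterations):
        theta = fit_rotation(src, tgt, dictionary)
        dictionary = induce_dictionary(src, tgt, theta)
    return theta, dictionary

def unit(deg):
    """Unit vector at the given angle in degrees."""
    r = math.radians(deg)
    return (math.cos(r), math.sin(r))

# Toy data: the target space is the source space rotated by 40 degrees.
src = {"a": unit(0), "b": unit(50), "c": unit(120), "d": unit(200), "e": unit(290)}
tgt = {w.upper(): rotate(v, math.radians(40)) for w, v in src.items()}
noisy_seed = {"a": "A", "b": "B", "c": "D", "d": "C", "e": "E"}  # two wrong pairs
theta, learned = self_learn(src, tgt, noisy_seed)
```

In this toy setting the loop repairs the two wrong seed pairs and recovers the 40-degree rotation; real embeddings are high-dimensional and much noisier.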

The risk of sub-optimal use of Open Source NLP Software: UKB is inadvertently state-of-the-art in knowledge-based WSD

May 11, 2018
Eneko Agirre, Oier López de Lacalle, Aitor Soroa

UKB is an open source collection of programs for performing, among other tasks, knowledge-based Word Sense Disambiguation (WSD). Since its release in 2009, it has often been used out-of-the-box in sub-optimal settings. We show that nine years later it is the state-of-the-art in knowledge-based WSD. This case shows the pitfalls of releasing open source NLP software without optimal default settings and precise instructions for reproducibility.

Unsupervised Neural Machine Translation

Feb 26, 2018
Mikel Artetxe, Gorka Labaka, Eneko Agirre, Kyunghyun Cho

In spite of the recent success of neural machine translation (NMT) in standard benchmarks, the lack of large parallel corpora poses a major practical problem for many language pairs. There have been several proposals to alleviate this issue with, for instance, triangulation and semi-supervised learning techniques, but they still require a strong cross-lingual signal. In this work, we completely remove the need for parallel data and propose a novel method to train an NMT system in a completely unsupervised manner, relying on nothing but monolingual corpora. Our model builds upon the recent work on unsupervised embedding mappings, and consists of a slightly modified attentional encoder-decoder model that can be trained on monolingual corpora alone using a combination of denoising and backtranslation. Despite the simplicity of the approach, our system obtains 15.56 and 10.21 BLEU points in WMT 2014 French-to-English and German-to-English translation, respectively. The model can also profit from small parallel corpora, and attains 21.81 and 15.24 points, respectively, when combined with 100,000 parallel sentences. Our implementation is released as an open source project.

* Published as a conference paper at ICLR 2018
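
The denoising objective mentioned in the abstract trains the model to reconstruct a clean sentence from a corrupted copy. A minimal sketch of building such training pairs follows, assuming the corruption is a series of random swaps between adjacent words; the abstract does not specify the noise function, so the number of swaps and the helper names are hypothetical.

```python
import random

def add_swap_noise(tokens, rng):
    """Corrupt a sentence by swapping randomly chosen adjacent words.

    The number of swaps (half the sentence length) is an assumed setting.
    """
    noisy = list(tokens)
    if len(noisy) < 2:
        return noisy  # nothing to swap
    for _ in range(len(noisy) // 2):
        i = rng.randrange(len(noisy) - 1)
        noisy[i], noisy[i + 1] = noisy[i + 1], noisy[i]
    return noisy

def denoising_pair(sentence, seed=0):
    """Return (noisy input, clean target) for one monolingual sentence."""
    tokens = sentence.split()
    rng = random.Random(seed)  # seeded so the example is reproducible
    return add_swap_noise(tokens, rng), tokens

noisy, clean = denoising_pair("the cat sat on the mat")
```

Backtranslation would complement this: the current model translates a monolingual sentence into the other language, and the resulting synthetic pair is used to train the reverse direction.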

SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation

Jul 31, 2017
Daniel Cer, Mona Diab, Eneko Agirre, Iñigo Lopez-Gazpio, Lucia Specia

Semantic Textual Similarity (STS) measures the meaning similarity of sentences. Applications include machine translation (MT), summarization, generation, question answering (QA), short answer grading, semantic search, dialog and conversational systems. The STS shared task is a venue for assessing the current state-of-the-art. The 2017 task focuses on multilingual and cross-lingual pairs with one sub-track exploring MT quality estimation (MTQE) data. The task obtained strong participation from 31 teams, with 17 participating in all language tracks. We summarize performance and review a selection of well performing methods. Analysis highlights common errors, providing insight into the limitations of existing models. To support ongoing work on semantic representations, the STS Benchmark is introduced as a new shared training and evaluation set carefully selected from the corpus of English STS shared task data (2012-2017).

* To appear in proceedings of the SemEval workshop at ACL 2017; 14 pages, 14 Tables, 1 Figure

Evaluating the word-expert approach for Named-Entity Disambiguation

Mar 15, 2016
Angel X. Chang, Valentin I. Spitkovsky, Christopher D. Manning, Eneko Agirre

Named Entity Disambiguation (NED) is the task of linking a named-entity mention to an instance in a knowledge-base, typically Wikipedia. This task is closely related to word-sense disambiguation (WSD), where the supervised word-expert approach has prevailed. In this work we present the results of the word-expert approach to NED, where one classifier is built for each target entity mention string. The resources necessary to build the system, a dictionary and a set of training instances, have been automatically derived from Wikipedia. We provide empirical evidence of the value of this approach, as well as a study of the differences between WSD and NED, including ambiguity and synonymy statistics.

Improving distant supervision using inference learning

Sep 12, 2015
Roland Roller, Eneko Agirre, Aitor Soroa, Mark Stevenson

Distant supervision is a widely applied approach to automatic training of relation extraction systems and has the advantage that it can generate large amounts of labelled data with minimal effort. However, this data may contain errors and consequently systems trained using distant supervision tend not to perform as well as those based on manually labelled data. This work proposes a novel method for detecting potential false negative training examples using a knowledge inference method. Results show that our approach improves the performance of relation extraction systems trained using distantly supervised data.

* In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

Studying the Wikipedia Hyperlink Graph for Relatedness and Disambiguation

Mar 12, 2015
Eneko Agirre, Ander Barrena, Aitor Soroa

Hyperlinks and other relations in Wikipedia are an extraordinary resource which is still not fully understood. In this paper we study the different types of links in Wikipedia, and contrast the use of the full graph with respect to just direct links. We apply a well-known random walk algorithm on two tasks, word relatedness and named-entity disambiguation. We show that using the full graph is more effective than just direct links by a large margin, that non-reciprocal links harm performance, and that there is no benefit from categories and infoboxes, with coherent results on both tasks. We set new state-of-the-art figures for systems based on Wikipedia links, comparable to systems exploiting several information sources and/or supervised machine learning. Our approach is open source, with instructions to reproduce results, and amenable to integration with complementary text-based methods.
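
To make the graph-based setup concrete, here is a minimal sketch of a random walk over a link graph restricted to reciprocal links. The abstract only says "a well-known random walk algorithm"; personalized PageRank is assumed here for illustration, and the toy graph, damping value and function names are hypothetical.

```python
def reciprocal_only(links):
    """Keep a directed link (u, v) only if the reverse link (v, u) also exists."""
    return {(u, v) for (u, v) in links if (v, u) in links}

def personalized_pagerank(links, seed, damping=0.85, iterations=50):
    """Power iteration for a random walk that restarts at `seed`."""
    nodes = sorted({n for edge in links for n in edge} | {seed})
    out = {n: [v for (u, v) in links if u == n] for n in nodes}
    rank = {n: 1.0 if n == seed else 0.0 for n in nodes}
    for _ in range(iterations):
        # Restart mass always goes back to the seed node.
        nxt = {n: (1 - damping) if n == seed else 0.0 for n in nodes}
        for n in nodes:
            if out[n]:
                share = damping * rank[n] / len(out[n])
                for v in out[n]:
                    nxt[v] += share
            else:
                nxt[seed] += damping * rank[n]  # dangling node: return to seed
        rank = nxt
    return rank

# Toy link graph; "Spain" does not link back to "Bilbao".
links = {
    ("Bilbao", "Basque Country"), ("Basque Country", "Bilbao"),
    ("Basque Country", "Basque language"), ("Basque language", "Basque Country"),
    ("Bilbao", "Spain"),
}
graph = reciprocal_only(links)
rank = personalized_pagerank(graph, seed="Bilbao")
```

Walking the full graph gives "Basque language" a non-zero score with respect to "Bilbao" even though the two pages are not directly linked, which is the kind of indirect evidence that direct links alone would miss.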

The Basque task: did systems perform in the upperbound?

Apr 12, 2002
Eneko Agirre, Elena Garcia, Mikel Lersundi, David Martinez, Eli Pociello

In this paper we describe the Senseval 2 Basque lexical-sample task. The task comprised 40 words (15 nouns, 15 verbs and 10 adjectives) selected from Euskal Hiztegia, the main Basque dictionary. Most examples were taken from the Egunkaria newspaper. The method used to hand-tag the examples produced low inter-tagger agreement (75%) before arbitration. The four competing systems attained results well above the most frequent sense baseline, and the best system scored 75% precision at 100% coverage. The paper includes an analysis of the tagging procedure used, as well as the performance of the competing systems. In particular, we argue that inter-tagger agreement is not a real upperbound for the Basque WSD task.

* Proceedings of the SENSEVAL-2 Workshop. In conjunction with ACL'2001/EACL'2001. Toulouse
* 4 pages

Decision Lists for English and Basque

Apr 12, 2002
Eneko Agirre, David Martinez

In this paper we describe the systems we developed for the English (lexical and all-words) and Basque tasks. They were all supervised systems based on Yarowsky's Decision Lists. We used Semcor for training in the English all-words task. We defined different feature sets for each language. For Basque, in order to extract all the information from the text, we defined features that have not been used before in the literature, using a morphological analyzer. We also implemented systems that automatically selected good features and were able to obtain a preset precision (85%) at the cost of coverage. The systems that used all the features were identified as BCU-ehu-dlist-all and the systems that selected some features as BCU-ehu-dlist-best.

* Proceedings of the SENSEVAL-2 Workshop. In conjunction with ACL'2001/EACL'2001. Toulouse
* 4 pages
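
A decision list in the style the abstract refers to can be sketched as follows: each (feature, sense) rule is weighted by a smoothed log-likelihood ratio, rules are sorted by weight, and the first matching rule decides the sense. The smoothing constant, the `min_weight` threshold and the toy features are assumptions made for illustration, not the settings of the submitted systems.

```python
import math
from collections import Counter, defaultdict

def train_decision_list(examples, smoothing=0.1):
    """Build rules (weight, feature, sense) sorted from strongest to weakest."""
    counts = defaultdict(Counter)
    senses = set()
    for features, sense in examples:
        senses.add(sense)
        for f in features:
            counts[f][sense] += 1
    rules = []
    for f, by_sense in counts.items():
        total = sum(by_sense.values())
        for s in senses:
            pos = by_sense[s] + smoothing          # feature seen with this sense
            neg = total - by_sense[s] + smoothing  # feature seen with other senses
            rules.append((math.log(pos / neg), f, s))
    rules.sort(key=lambda r: (-r[0], r[1], r[2]))
    return rules

def classify(rules, features, min_weight=0.0):
    """Apply the first matching rule; abstain (None) below `min_weight`."""
    for weight, f, s in rules:
        if weight < min_weight:
            break
        if f in features:
            return s
    return None

# Toy training data for the word "bank" (hypothetical features and senses).
examples = [
    ({"river", "water"}, "shore"),
    ({"river", "fishing"}, "shore"),
    ({"money", "loan"}, "finance"),
    ({"money", "river"}, "finance"),
]
rules = train_decision_list(examples)
```

Raising `min_weight` makes the classifier abstain on weak evidence, which is one way to trade coverage for precision, in the spirit of the feature-selecting systems described above.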
