Adam Lopez

Inflecting when there's no majority: Limitations of encoder-decoder neural networks as cognitive models for German plurals

May 18, 2020
Kate McCurdy, Sharon Goldwater, Adam Lopez

Can artificial neural networks learn to represent inflectional morphology and generalize to new words as human speakers do? Kirov and Cotterell (2018) argue that the answer is yes: modern Encoder-Decoder (ED) architectures learn human-like behavior when inflecting English verbs, such as extending the regular past tense form -(e)d to novel words. However, their work does not address the criticism raised by Marcus et al. (1995): that neural models may learn to extend not the regular, but the most frequent class -- and thus fail on tasks like German number inflection, where infrequent suffixes like -s can still be productively generalized. To investigate this question, we first collect a new dataset from German speakers (production and ratings of plural forms for novel nouns) that is designed to avoid sources of information unavailable to the ED model. The speaker data show high variability, and two suffixes evince 'regular' behavior, appearing more often with phonologically atypical inputs. Encoder-decoder models do generalize the most frequently produced plural class, but do not show human-like variability or 'regular' extension of these other plural markers. We conclude that modern neural models may still struggle with minority-class generalization.

* To appear at ACL 2020

Word Interdependence Exposes How LSTMs Compose Representations

Apr 27, 2020
Naomi Saphra, Adam Lopez

Recent work in NLP shows that LSTM language models capture compositional structure in language data. For a closer look at how these representations are composed hierarchically, we present a novel measure of interdependence between word meanings in an LSTM, based on their interactions at the internal gates. To explore how compositional representations arise over training, we conduct simple experiments on synthetic data, which illustrate our measure by showing how high interdependence can hurt generalization. These synthetic experiments also illustrate a specific hypothesis about how hierarchical structures are discovered over the course of training: that parent constituents rely on effective representations of their children, rather than on learning long-range relations independently. We further support this measure with experiments on English language data, where interdependence is higher for more closely syntactically linked word pairs.

How to Evaluate Word Representations of Informal Domain?

Nov 13, 2019
Yekun Chai, Naomi Saphra, Adam Lopez

Diverse word representations are now used in most state-of-the-art natural language processing (NLP) applications. Nevertheless, how to efficiently evaluate such word embeddings in informal domains such as Twitter or forums remains an open challenge, owing to the lack of sufficient evaluation datasets. We derive a large list of variant spelling pairs from UrbanDictionary using two automatic approaches: weakly-supervised pattern-based bootstrapping and a self-training linear-chain conditional random field (CRF). These extracted relation pairs make it more feasible to skip the text normalization step of traditional NLP pipelines and to use representations of non-standard words in the informal domain directly. Our code is available.

Semantic Graph Parsing with Recurrent Neural Network DAG Grammars

Oct 20, 2019
Federico Fancellu, Sorcha Gilroy, Adam Lopez, Mirella Lapata

Semantic parses are directed acyclic graphs (DAGs), so semantic parsing should be modeled as graph prediction. But predicting graphs presents difficult technical challenges, so it is simpler and more common to predict the linearized graphs found in semantic parsing datasets using well-understood sequence models. The cost of this simplicity is that the predicted strings may not be well-formed graphs. We present recurrent neural network DAG grammars, a graph-aware sequence model that ensures only well-formed graphs while sidestepping many difficulties in graph prediction. We test our model on the Parallel Meaning Bank, a multilingual semantic graphbank. Our approach yields competitive results in English and establishes the first results for German, Italian and Dutch.

* 9 pages, to appear in EMNLP 2019

A systematic comparison of methods for low-resource dependency parsing on genuinely low-resource languages

Sep 06, 2019
Clara Vania, Yova Kementchedjhieva, Anders Søgaard, Adam Lopez

Parsers are available for only a handful of the world's languages, since they require lots of training data. How far can we get with just a small amount of training data? We systematically compare a set of simple strategies for improving low-resource parsers: data augmentation, which has not been tested before; cross-lingual training; and transliteration. Experimenting on three typologically diverse low-resource languages (North Sámi, Galician, and Kazakh), we find that (1) when only the low-resource treebank is available, data augmentation is very helpful; (2) when a related high-resource treebank is available, cross-lingual training is helpful and complements data augmentation; and (3) when the high-resource treebank uses a different writing system, transliteration into a shared orthographic space is also very helpful.

* EMNLP 2019

Classifying topics in speech when all you have is crummy translations

Aug 29, 2019
Sameer Bansal, Herman Kamper, Adam Lopez, Sharon Goldwater

Given a large amount of unannotated speech in a language with few resources, can we classify the speech utterances by topic? We show that this is possible if text translations are available for just a small amount of speech (less than 20 hours), using a recent model for direct speech-to-text translation. While the translations are poor, they are still good enough to correctly classify 1-minute speech segments over 70% of the time, a 20% improvement over a majority-class baseline. Such a system might be useful for humanitarian applications like crisis response, where incoming speech must be quickly assessed for further action.
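
To make the baseline comparison concrete, here is a minimal sketch of how a classifier's accuracy could be set against a majority-class baseline. The topic labels and predictions are invented toy values, not data from the paper, and the function names are hypothetical.

```python
from collections import Counter


def majority_class_accuracy(gold):
    """Accuracy obtained by always predicting the most frequent gold label."""
    _, freq = Counter(gold).most_common(1)[0]
    return freq / len(gold)


def accuracy(gold, predicted):
    """Fraction of segments whose predicted topic matches the gold topic."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy data (hypothetical): one topic label per speech segment.
gold = ["health", "health", "family", "health", "school",
        "family", "health", "school", "health", "family"]
predicted = ["health", "health", "family", "health", "family",
             "family", "school", "school", "health", "health"]

baseline = majority_class_accuracy(gold)
model = accuracy(gold, predicted)
print(f"baseline={baseline:.2f} model={model:.2f} gain={model - baseline:.2f}")
```

The baseline ignores the input entirely, so any gain over it is evidence that the (noisy) translations carry usable topic information.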

Sparsity Emerges Naturally in Neural Language Models

Jul 22, 2019
Naomi Saphra, Adam Lopez

Concerns about interpretability, computational resources, and principled inductive priors have motivated efforts to engineer sparse neural models for NLP tasks. If sparsity is important for NLP, might well-trained neural models naturally become roughly sparse? Using the Taxi-Euclidean norm to measure sparsity, we find that frequent input words are associated with concentrated or sparse activations, while frequent target words are associated with dispersed activations but concentrated gradients. We find that gradients associated with function words are more concentrated than the gradients of content words, even controlling for word frequency.

* Published in the ICML 2019 Workshop on Identifying and Understanding Deep Learning Phenomena: https://openreview.net/forum?id=H1ets1h56E
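
As an illustration of the sparsity measure, the sketch below computes a Taxi-Euclidean ratio for a vector, assuming it is defined as the L1 norm divided by the L2 norm and scaled by the square root of the dimension. That normalization is an assumption made here so that values fall between 1/sqrt(n) and 1; the paper's exact definition may differ. The function name and example vectors are hypothetical.

```python
import math


def taxi_euclidean(v):
    """Ratio ||v||_1 / (sqrt(n) * ||v||_2); lower values mean more concentrated.

    Assumed normalization: sqrt(n) scaling, so a one-hot vector scores
    1/sqrt(n) and a vector with equal-magnitude entries scores 1.
    """
    n = len(v)
    l1 = sum(abs(x) for x in v)
    l2 = math.sqrt(sum(x * x for x in v))
    if l2 == 0.0:
        raise ValueError("ratio is undefined for the zero vector")
    return l1 / (math.sqrt(n) * l2)


one_hot = [0.0, 0.0, 1.0, 0.0]   # all mass on one unit: concentrated
uniform = [0.5, 0.5, 0.5, 0.5]   # mass spread evenly: dispersed

print(taxi_euclidean(one_hot))
print(taxi_euclidean(uniform))
```

Under this definition, activations or gradients described as "concentrated" would score closer to the lower bound, and "dispersed" ones closer to 1.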

Understanding Learning Dynamics Of Language Models with SVCCA

Nov 01, 2018
Naomi Saphra, Adam Lopez

Recent work has demonstrated that neural language models encode linguistic structure implicitly in a number of ways. However, existing research has not shed light on the process by which this structure is acquired during training. We use SVCCA as a tool for understanding how a language model is implicitly predicting a variety of word cluster tags. We present experiments suggesting that a single recurrent layer of a language model learns linguistic structure in phases. We find, for example, that a language model naturally stabilizes its representation of part of speech earlier than it learns semantic and topic information.
