Abstract: We present a novel semi-supervised approach for sequence transduction and apply it to semantic parsing. The unsupervised component is based on a generative model in which latent sentences generate the unpaired logical forms. We apply this method to a number of semantic parsing tasks, focusing on domains with limited access to labelled training data, and extend those datasets with synthetically generated logical forms.
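To make the shape of such a semi-supervised objective concrete, here is a minimal structural sketch in Python. The toy scorers `log_p_y_given_x` and `sample_latent_sentence` are stand-ins invented for illustration, not the paper's models; the point is only the form of the loss: a supervised term on paired data plus an autoencoding term that reconstructs each unpaired logical form through a latent sentence, reusing the same decoder.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_p_y_given_x(x, y):
    """Decoder: log p(logical form y | sentence x). Toy length-based score."""
    return -0.5 * abs(len(x) - len(y))

def sample_latent_sentence(y):
    """Encoder q(x | y): propose a latent sentence for an unpaired logical
    form. Here just a toy corruption of the logical form's tokens."""
    return [tok for tok in y if rng.random() > 0.1]

def semi_supervised_loss(paired, unpaired):
    """Supervised likelihood on (sentence, logical form) pairs, plus an
    autoencoding term that sends each unpaired logical form through a
    sampled latent sentence and back through the shared decoder."""
    sup = -sum(log_p_y_given_x(x, y) for x, y in paired)
    unsup = -sum(log_p_y_given_x(sample_latent_sentence(y), y) for y in unpaired)
    return sup + unsup

paired = [(["list", "flights", "to", "boston"], ["lambda", "$0", "flight"])]
unpaired = [["lambda", "$0", "airport"]]
print(semi_supervised_loss(paired, unpaired))
```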
Abstract: Many language generation tasks require the production of text conditioned on both structured and unstructured inputs. We present a novel neural network architecture which generates an output sequence conditioned on an arbitrary number of input functions. Crucially, our approach allows both the choice of conditioning context and the granularity of generation, for example characters or tokens, to be marginalised, thus permitting scalable and effective training. Using this framework, we address the problem of generating programming code from a mixed natural language and structured specification. We create two new data sets for this paradigm derived from the collectible trading card games Magic the Gathering and Hearthstone. On these, and a third preexisting corpus, we demonstrate that marginalising multiple predictors allows our model to outperform strong benchmarks.
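A minimal sketch of this marginalisation at a single decoding step, under assumed toy distributions (the names `char_lm` and `copy_spec` are illustrative, not from the paper): with the choice of predictor treated as latent, the probability of emitting a token is a mixture over predictors, so no supervision is needed on which predictor produced it.

```python
import numpy as np

def marginal_token_logprob(predictor_probs, token_dists, token):
    """log p(token) with the predictor latent:
    p(token) = sum_k p(predictor = k) * p(token | predictor = k)."""
    total = sum(p_k * dist.get(token, 0.0)
                for p_k, dist in zip(predictor_probs, token_dists))
    return np.log(total)

# One decoding step: a character-level language model competes with a
# mechanism that copies whole field values from the structured card spec.
char_lm = {"D": 0.20, "e": 0.10}            # generates character by character
copy_spec = {"Deathwing": 0.90, "9": 0.05}  # copies fields of the input card
print(marginal_token_logprob([0.3, 0.7], [char_lm, copy_spec], "Deathwing"))
```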
Abstract: While most approaches to automatically recognizing entailment relations have used classifiers employing hand-engineered features derived from complex natural language processing pipelines, in practice their performance has been only slightly better than bag-of-word pair classifiers using only lexical similarity. The only attempt so far to build an end-to-end differentiable neural network for entailment failed to outperform such a simple similarity classifier. In this paper, we propose a neural model that reads two sentences to determine entailment using long short-term memory units. We extend this model with a word-by-word neural attention mechanism that encourages reasoning over entailments of pairs of words and phrases. Furthermore, we present a qualitative analysis of attention weights produced by this model, demonstrating such reasoning capabilities. On a large entailment dataset this model outperforms the previous best neural model and a classifier with engineered features by a substantial margin. It is the first generic end-to-end differentiable system that achieves state-of-the-art accuracy on a textual entailment dataset.
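A minimal numpy sketch of word-by-word attention, with toy dimensions and random parameters (the full model also propagates an attention memory across hypothesis positions, omitted here): each hypothesis token induces a softmax over premise tokens, indicating which premise words it is matched against.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8                                       # toy hidden size
W_y = rng.normal(scale=0.1, size=(DIM, DIM))
W_h = rng.normal(scale=0.1, size=(DIM, DIM))
w = rng.normal(scale=0.1, size=DIM)

def word_by_word_attention(premise_states, hypothesis_states):
    """Return a (len_hypothesis, len_premise) matrix of attention weights."""
    rows = []
    for h_t in hypothesis_states:
        m = np.tanh(premise_states @ W_y.T + h_t @ W_h.T)  # (len_p, DIM)
        scores = m @ w
        alpha = np.exp(scores - scores.max())              # stable softmax
        rows.append(alpha / alpha.sum())
    return np.array(rows)

premise = rng.normal(size=(5, DIM))     # stand-in LSTM states, 5 premise words
hypothesis = rng.normal(size=(3, DIM))  # stand-in LSTM states, 3 hypothesis words
print(word_by_word_attention(premise, hypothesis).round(2))
```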
Abstract: Teaching machines to read natural language documents remains an elusive challenge. Machine reading systems can be tested on their ability to answer questions posed about the contents of documents that they have seen, but until now large-scale training and test datasets have been missing for this type of evaluation. In this work we define a new methodology that resolves this bottleneck and provides large-scale supervised reading comprehension data. This allows us to develop a class of attention-based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure.
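A toy sketch of the cloze-style data construction (the marker format and helper name are illustrative; the actual pipeline also permutes entity ids per document so answers cannot be guessed from world knowledge): entity mentions are anonymised, and a summary sentence with one entity blanked out becomes the query.

```python
def make_cloze(context: str, summary_sentence: str, entities: list):
    """Build a (context, query, answer) triple: anonymise entity mentions,
    then blank one entity out of a summary sentence to form the query."""
    markers = {e: f"@entity{i}" for i, e in enumerate(entities)}

    def anonymise(text):
        for entity, marker in markers.items():
            text = text.replace(entity, marker)
        return text

    answer = entities[0]  # the entity to blank out of the query
    query = anonymise(summary_sentence).replace(markers[answer], "@placeholder")
    return anonymise(context), query, markers[answer]

ctx = "Alice met Bob in Paris and handed him the documents."
summary = "Alice gave the documents to Bob."
print(make_cloze(ctx, summary, ["Alice", "Bob", "Paris"]))
```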
Abstract: Deep Recurrent Neural Networks have recently demonstrated strong results on natural language transduction problems. In this paper we explore the representational power of these models using synthetic grammars designed to exhibit phenomena similar to those found in real transduction problems such as machine translation. These experiments lead us to propose new memory-based recurrent networks that implement continuously differentiable analogues of traditional data structures such as Stacks, Queues, and DeQues. We show that these architectures exhibit superior generalisation performance to Deep RNNs and are often able to learn the underlying generating algorithms in our transduction experiments.
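A minimal numpy sketch of the continuous stack, following the push/pop/read scheme the paper describes (toy values; the controller network that emits the value v and the push/pop strengths d and u is omitted): every cell carries a scalar strength, so push and pop become soft, differentiable operations.

```python
import numpy as np

class NeuralStack:
    """Continuously differentiable stack: cells hold a vector value and a
    scalar strength in [0, 1]; popping removes strength from the top down."""

    def __init__(self, dim):
        self.values = np.zeros((0, dim))
        self.strengths = np.zeros(0)

    def step(self, v, d, u):
        """Pop with strength u, push vector v with strength d, then read."""
        old = self.strengths
        new = np.empty_like(old)
        for i in range(len(old)):               # pop: eat u from the top down
            above = old[i + 1:].sum()
            new[i] = max(0.0, old[i] - max(0.0, u - above))
        self.values = np.vstack([self.values, v])
        self.strengths = np.append(new, d)
        r = np.zeros_like(v)                    # read the top 1.0 of strength
        for i in range(len(self.strengths)):
            above = self.strengths[i + 1:].sum()
            r += min(self.strengths[i], max(0.0, 1.0 - above)) * self.values[i]
        return r

stack = NeuralStack(3)
stack.step(np.array([1.0, 0.0, 0.0]), d=0.8, u=0.0)
print(stack.step(np.array([0.0, 1.0, 0.0]), d=0.5, u=0.1))
```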
Abstract: Answer sentence selection is the task of identifying sentences that contain the answer to a given question. This is an important problem in its own right as well as in the larger context of open domain question answering. We propose a novel approach to solving this task by means of distributed representations, and learn to match questions with answers by considering their semantic encoding. This contrasts with prior work on this task, which typically relies on classifiers with large numbers of hand-crafted syntactic and semantic features and various external resources. Our approach does not require any feature engineering nor does it involve specialist linguistic data, making this model easily applicable to a wide range of domains and languages. Experimental results on a standard benchmark dataset from TREC demonstrate that---despite its simplicity---our model matches state-of-the-art performance on the answer sentence selection task.
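A minimal sketch of this kind of matching model with the simplest composition, averaging word embeddings (random toy parameters; a convolutional composition is a natural alternative): question and answer encodings are scored through a bilinear form and squashed to a match probability.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, DIM = 100, 32
E = rng.normal(scale=0.1, size=(VOCAB, DIM))  # shared word embeddings
M = rng.normal(scale=0.1, size=(DIM, DIM))    # bilinear interaction matrix
b = 0.0                                       # bias term

def sentence_vec(word_ids):
    """Bag-of-words composition: average the word embeddings."""
    return E[word_ids].mean(axis=0)

def match_prob(question_ids, answer_ids):
    """p(answer is correct | question) = sigmoid(q^T M a + b)."""
    q = sentence_vec(question_ids)
    a = sentence_vec(answer_ids)
    return 1.0 / (1.0 + np.exp(-(q @ M @ a + b)))

print(match_prob([1, 5, 9], [2, 7]))
```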
Abstract: The mathematical representation of semantics is a key issue for Natural Language Processing (NLP). Much research has been devoted to finding ways of representing the semantics of individual words in vector spaces. Distributional approaches --- meaning distributed representations that exploit co-occurrence statistics of large corpora --- have proved popular and successful across a number of tasks. However, natural language usually comes in structures beyond the word level, with meaning arising not only from the individual words but also from the structure they are contained in at the phrasal or sentential level. Modelling the compositional process by which the meaning of an utterance arises from the meaning of its parts is an equally fundamental task of NLP. This dissertation explores methods for learning distributed semantic representations and models for composing these into representations for larger linguistic units. Our underlying hypothesis is that neural models are a suitable vehicle for learning semantically rich representations and that such representations in turn are suitable vehicles for solving important tasks in natural language processing. The contribution of this thesis is a thorough evaluation of our hypothesis, as part of which we introduce several new approaches to representation learning and compositional semantics, as well as multiple state-of-the-art models which apply distributed semantic representations to various tasks in NLP.
Abstract: We present a probabilistic model that simultaneously learns alignments and distributed representations for bilingual data. By marginalizing over word alignments, the model captures a larger semantic context than prior work relying on hard alignments. The advantage of this approach is demonstrated in a cross-lingual classification task, where we outperform the prior published state of the art.
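A toy numpy sketch of the marginalisation, with a uniform alignment prior assumed for brevity: the probability of a target word is averaged over which source word generated it, rather than committing to one hard alignment.

```python
import numpy as np

rng = np.random.default_rng(0)
V_SRC, V_TGT, DIM = 50, 50, 16                     # toy vocabularies, dims
E_src = rng.normal(scale=0.1, size=(V_SRC, DIM))   # source word embeddings
W_out = rng.normal(scale=0.1, size=(DIM, V_TGT))   # target softmax weights

def log_prob_target_word(t, src_sent):
    """log p(t | source sentence), marginalizing the aligned source word:
    p(t | s) = sum_a p(a) p(t | s_a), with a uniform prior p(a) = 1/n."""
    logits = E_src[src_sent] @ W_out               # (n, V_TGT)
    logits -= logits.max(axis=1, keepdims=True)    # stabilise softmax
    p_t_given_a = np.exp(logits[:, t]) / np.exp(logits).sum(axis=1)
    return np.log(p_t_given_a.mean())

src = [3, 17, 42]  # a toy source sentence as word ids
print(log_prob_target_word(7, src))
```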
Abstract: Many successful approaches to semantic parsing build on top of the syntactic analysis of text, and make use of distributional representations or statistical models to match parses to ontology-specific queries. This paper presents a novel deep learning architecture which provides a semantic parsing system through the union of two neural models of language semantics. It allows for the generation of ontology-specific queries from natural language statements and questions without the need for parsing, which makes it especially suitable for grammatically malformed or syntactically atypical text, such as tweets, as well as permitting the development of semantic parsers for resource-poor languages.
Abstract: We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of semantically equivalent sentences, while maintaining sufficient distance between those of dissimilar sentences. The models do not rely on word alignments or any syntactic information and are successfully applied to a number of diverse languages. We also extend our approach to learn semantic representations at the document level. We evaluate these models on two cross-lingual document classification tasks, outperforming the prior state of the art. Through qualitative analysis and the study of pivoting effects we demonstrate that our representations are semantically plausible and can capture semantic relationships across languages without parallel data.
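A minimal sketch of this kind of training signal with additive composition (toy random vectors; the margin value is an assumption): parallel sentence pairs are pulled together while a sampled non-translation is pushed at least a margin further away.

```python
import numpy as np

def compose(word_vecs):
    """Additive composition: a sentence is the sum of its word vectors."""
    return word_vecs.sum(axis=0)

def hinge_loss(src_sent, trg_sent, noise_sent, margin=1.0):
    """Noise-contrastive margin objective: the true translation should be
    closer (squared Euclidean) than a random sentence, by at least margin."""
    a, b, n = compose(src_sent), compose(trg_sent), compose(noise_sent)
    d_pos = np.sum((a - b) ** 2)
    d_neg = np.sum((a - n) ** 2)
    return max(0.0, margin + d_pos - d_neg)

rng = np.random.default_rng(0)
en = rng.normal(size=(4, 8))     # toy English sentence: 4 words, dim 8
de = rng.normal(size=(5, 8))     # its translation
noise = rng.normal(size=(6, 8))  # a sampled non-translation
print(hinge_loss(en, de, noise))
```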