Phil Blunsom

Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

Jun 15, 2014
Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas

Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the meaning of documents by embedding them in a low dimensional vector space, while preserving distinctions of word and sentence order crucial for capturing nuanced semantics. Our model is based on an extended Dynamic Convolution Neural Network, which learns convolution filters at both the sentence and document level, hierarchically learning to capture and compose low level lexical features into high level semantic concepts. We demonstrate the effectiveness of this model on a range of document modelling tasks, achieving strong results with no feature engineering and with a more compact model. Inspired by recent advances in visualising deep convolution networks for computer vision, we present a novel visualisation technique for our document networks which not only provides insight into their learning process, but also can be interpreted to produce a compelling automatic summarisation system for texts.

Compositional Morphology for Word Representations and Language Modelling

May 16, 2014
Jan A. Botha, Phil Blunsom

This paper presents a scalable method for integrating compositional morphological representations into a vector-based probabilistic language model. Our approach is evaluated in the context of log-bilinear language models, rendered suitably efficient for implementation inside a machine translation decoder by factoring the vocabulary. We perform both intrinsic and extrinsic evaluations, presenting results on a range of languages which demonstrate that our model learns morphological representations that both perform well on word similarity tasks and lead to substantial reductions in perplexity. When used for translation into morphologically rich languages with large vocabularies, our models obtain improvements of up to 1.2 BLEU points relative to a baseline system using back-off n-gram models.

* Proceedings of the 31st International Conference on Machine Learning (ICML)
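
The abstract above mentions making the language model efficient "by factoring the vocabulary". One common reading of that phrase is a class-based factorisation, where P(word | history) = P(class | history) × P(word | class, history), so each prediction normalises over the classes and over one class's members rather than over the whole vocabulary. The sketch below illustrates that idea only; the function names, the toy class assignment and the scores are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def factored_word_prob(word, class_of, class_scores, word_scores):
    """P(word | h) = P(class | h) * P(word | class, h).

    class_of:     word -> class id (hypothetical hard clustering)
    class_scores: class id -> raw score given the history h
    word_scores:  word -> raw score given the history h
    """
    classes = sorted(class_scores)
    c = class_of[word]
    # Normalise over classes only.
    p_class = softmax([class_scores[k] for k in classes])[classes.index(c)]
    # Normalise over the words that share this word's class only.
    members = sorted(w for w, k in class_of.items() if k == c)
    p_word = softmax([word_scores[w] for w in members])[members.index(word)]
    return p_class * p_word


# Toy data, invented for illustration.
class_of = {"cat": 0, "dog": 0, "runs": 1, "sleeps": 1, "the": 2}
class_scores = {0: 1.2, 1: 0.3, 2: -0.5}
word_scores = {"cat": 0.7, "dog": 0.1, "runs": 2.0, "sleeps": 1.5, "the": 0.0}
total = sum(
    factored_word_prob(w, class_of, class_scores, word_scores) for w in class_of
)
```

Because each factor is itself a normalised distribution, the factored probabilities still sum to one over the vocabulary, which is what `total` checks.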

Learning Bilingual Word Representations by Marginalizing Alignments

May 05, 2014
Tomáš Kočiský, Karl Moritz Hermann, Phil Blunsom

We present a probabilistic model that simultaneously learns alignments and distributed representations for bilingual data. By marginalizing over word alignments the model captures a larger semantic context than prior work relying on hard alignments. The advantage of this approach is demonstrated in a cross-lingual classification task, where we outperform the prior published state of the art.

* Proceedings of ACL 2014 (Short Papers)

A Deep Architecture for Semantic Parsing

Apr 29, 2014
Edward Grefenstette, Phil Blunsom, Nando de Freitas, Karl Moritz Hermann

Many successful approaches to semantic parsing build on top of the syntactic analysis of text, and make use of distributional representations or statistical models to match parses to ontology-specific queries. This paper presents a novel deep learning architecture which provides a semantic parsing system through the union of two neural models of language semantics. It allows for the generation of ontology-specific queries from natural language statements and questions without the need for parsing, which makes it especially suitable to grammatically malformed or syntactically atypical text, such as tweets, as well as permitting the development of semantic parsers for resource-poor languages.

* In Proceedings of the Semantic Parsing Workshop at ACL 2014 (forthcoming)

Multilingual Models for Compositional Distributed Semantics

Apr 17, 2014
Karl Moritz Hermann, Phil Blunsom

We present a novel technique for learning semantic representations, which extends the distributional hypothesis to multilingual data and joint-space embeddings. Our models leverage parallel data and learn to strongly align the embeddings of semantically equivalent sentences, while maintaining sufficient distance between those of dissimilar sentences. The models do not rely on word alignments or any syntactic information and are successfully applied to a number of diverse languages. We extend our approach to learn semantic representations at the document level, too. We evaluate these models on two cross-lingual document classification tasks, outperforming the prior state of the art. Through qualitative analysis and the study of pivoting effects we demonstrate that our representations are semantically plausible and can capture semantic relationships across languages without parallel data.

* Proceedings of ACL 2014 (Long papers)
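
The abstract above describes pulling the embeddings of equivalent sentences together while keeping dissimilar sentences apart. A minimal sketch of one way to express such an objective is a margin-based hinge loss over sentence vectors. The additive composition function, the squared Euclidean distance, the margin value and all names below are assumptions made for illustration; the abstract does not specify them.

```python
def compose(word_vectors):
    """Assumed composition: sum the word vectors into one sentence vector."""
    return [sum(dim) for dim in zip(*word_vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def hinge_loss(src, tgt, noise, margin=1.0):
    """Zero once the aligned pair (src, tgt) is closer than the
    unaligned pair (src, noise) by at least `margin`."""
    return max(0.0, margin + sq_dist(src, tgt) - sq_dist(src, noise))


# Toy two-dimensional vectors, invented for illustration.
src = compose([[1.0, 0.0], [0.0, 1.0]])    # source sentence
tgt = compose([[0.5, 0.5], [0.5, 0.5]])    # its translation
far_noise = [3.0, 3.0]                     # clearly unrelated sentence
near_noise = [1.0, 1.5]                    # unrelated but too close

loss_far = hinge_loss(src, tgt, far_noise)
loss_near = hinge_loss(src, tgt, near_noise)
```

In this toy setup the far noise sentence already satisfies the margin and contributes no loss, while the near one is penalised, which is the "sufficient distance" behaviour the abstract describes.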

A Convolutional Neural Network for Modelling Sentences

Apr 08, 2014
Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom

The ability to accurately represent sentences is central to language understanding. We describe a convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of sentences. The network uses Dynamic k-Max Pooling, a global pooling operation over linear sequences. The network handles input sentences of varying length and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations. The network does not rely on a parse tree and is easily applicable to any language. We test the DCNN in four experiments: small scale binary and multi-class sentiment prediction, six-way question classification and Twitter sentiment prediction by distant supervision. The network achieves excellent performance in the first three tasks and a greater than 25% error reduction in the last task with respect to the strongest baseline.
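
To make the pooling operation named in the abstract concrete, the sketch below shows k-max pooling over a single feature row: it keeps the k largest activations and preserves their original left-to-right order, so relative position information survives pooling. The `dynamic_k` schedule, which shrinks k with depth but never below a fixed top-level value, is an assumption about how k might depend on sentence length and layer; the abstract does not give the formula, and all names here are hypothetical.

```python
def k_max_pooling(values, k):
    """Return the k largest entries of `values` in their original order."""
    if k <= 0:
        return []
    k = min(k, len(values))
    # Indices of the k largest values; ties go to the earlier position.
    top = sorted(range(len(values)), key=lambda i: (-values[i], i))[:k]
    # Re-sort the chosen indices so the output keeps sequence order.
    return [values[i] for i in sorted(top)]


def dynamic_k(layer, total_layers, sentence_length, k_top):
    """Assumed schedule: k = max(k_top, ceil((L - l) / L * s)).

    Integer arithmetic is used for the ceiling to avoid float rounding.
    """
    scaled = ((total_layers - layer) * sentence_length
              + total_layers - 1) // total_layers
    return max(k_top, scaled)


# One feature row from a hypothetical convolution layer.
row = [1, 5, 2, 8, 3]
pooled = k_max_pooling(row, 3)            # keeps 5, 8, 3 in that order

# For an 18-word sentence and 3 layers: wide early, k_top at the top.
k_first = dynamic_k(1, 3, 18, k_top=3)
k_last = dynamic_k(3, 3, 18, k_top=3)
```

Because k depends on the input length at the lower layers but is fixed at the top, a scheme like this maps sentences of varying length to a fixed-size representation.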

Multilingual Distributed Representations without Word Alignment

Mar 20, 2014
Karl Moritz Hermann, Phil Blunsom

Distributed representations of meaning are a natural way to encode covariance relationships between words and phrases in NLP. By overcoming data sparsity problems, as well as providing information about semantic relatedness which is not available in discrete representations, distributed representations have proven useful in many NLP tasks. Recent work has shown how compositional semantic representations can successfully be applied to a number of monolingual applications such as sentiment analysis. At the same time, there has been some initial success in work on learning shared word-level representations across languages. We combine these two approaches by proposing a method for learning distributed representations in a multilingual setup. Our model learns to assign similar embeddings to aligned sentences and dissimilar ones to sentences that are not aligned, without requiring word alignments. We show that our representations are semantically informative and apply them to a cross-lingual document classification task where we outperform the previous state of the art. Further, by employing parallel corpora of multiple language pairs we find that our model learns representations that capture semantic relationships across languages for which no parallel data was used.

* To appear at ICLR 2014

Modelling the Lexicon in Unsupervised Part of Speech Induction

Feb 26, 2014
Greg Dubbin, Phil Blunsom

Automatically inducing the syntactic part-of-speech categories for words in text is a fundamental task in Computational Linguistics. While the performance of unsupervised tagging models has been slowly improving, current state-of-the-art systems make the obviously incorrect assumption that all tokens of a given word type must share a single part-of-speech tag. This one-tag-per-type heuristic counters the tendency of Hidden Markov Model based taggers to overgenerate tags for a given word type. However, it is clearly incompatible with basic syntactic theory. In this paper we extend a state-of-the-art Pitman-Yor Hidden Markov Model tagger with an explicit model of the lexicon. In doing so we are able to incorporate a soft bias towards inducing few tags per type. We develop a particle filter for drawing samples from the posterior of our model and present empirical results that show that our model is competitive with, and faster than, the state of the art without making any unrealistic restrictions.

* To be presented at the 14th Conference of the European Chapter of the Association for Computational Linguistics

Recurrent Convolutional Neural Networks for Discourse Compositionality

Jun 15, 2013
Nal Kalchbrenner, Phil Blunsom

The compositionality of meaning extends beyond the single sentence. Just as words combine to form the meaning of sentences, so do sentences combine to form the meaning of paragraphs, dialogues and general discourse. We introduce both a sentence model and a discourse model corresponding to the two levels of compositionality. The sentence model adopts convolution as the central operation for composing semantic vectors and is based on a novel hierarchical convolutional neural network. The discourse model extends the sentence model and is based on a recurrent neural network that is conditioned in a novel way both on the current sentence and on the current speaker. The discourse model is able to capture both the sequentiality of sentences and the interaction between different speakers. Without feature engineering or pretraining and with simple greedy decoding, the discourse model coupled to the sentence model obtains state-of-the-art performance on a dialogue act classification experiment.

"Not not bad" is not "bad": A distributional account of negation

Jun 10, 2013
Karl Moritz Hermann, Edward Grefenstette, Phil Blunsom

With the increasing empirical success of distributional models of compositional semantics, it is timely to consider the types of textual logic that such models are capable of capturing. In this paper, we address shortcomings in the ability of current models to capture logical operations such as negation. As a solution we propose a tripartite formulation for a continuous vector space representation of semantics and subsequently use this representation to develop a formal compositional notion of negation within such models.

* 9 pages, to appear in Proceedings of the 2013 Workshop on Continuous Vector Space Models and their Compositionality
