Nal Kalchbrenner

Conditional Image Generation with PixelCNN Decoders

Jun 18, 2016

Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu

Abstract: This work explores conditional image generation with a new image density model based on the PixelCNN architecture. The model can be conditioned on any vector, including descriptive labels or tags, or latent embeddings created by other networks. When conditioned on class labels from the ImageNet database, the model is able to generate diverse, realistic scenes representing distinct animals, objects, landscapes and structures. When conditioned on an embedding produced by a convolutional network given a single image of an unseen face, it generates a variety of new portraits of the same person with different facial expressions, poses and lighting conditions. We also show that conditional PixelCNN can serve as a powerful decoder in an image autoencoder. Additionally, the gated convolutional layers in the proposed model improve the log-likelihood of PixelCNN to match the state-of-the-art performance of PixelRNN on ImageNet, with greatly reduced computational cost.
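The abstract does not spell out what a gated layer computes. As a rough sketch, assuming the gate takes the common form tanh(f) * sigmoid(g) and that the conditioning vector enters as an additive offset to both pre-activations (both are assumptions here, not statements from the listing), the elementwise part could look like this. The function name `gated_activation` and its parameters are hypothetical, and the convolutions that would produce the pre-activations are left out.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def gated_activation(filter_pre, gate_pre, cond_filter=None, cond_gate=None):
    """Elementwise tanh(f + cf) * sigmoid(g + cg).

    filter_pre / gate_pre stand in for the outputs of two convolutions
    (assumed); cond_filter / cond_gate stand in for projections of the
    conditioning vector (assumed). Missing offsets default to zero.
    """
    n = len(filter_pre)
    cond_filter = cond_filter if cond_filter is not None else [0.0] * n
    cond_gate = cond_gate if cond_gate is not None else [0.0] * n
    return [
        math.tanh(f + cf) * sigmoid(g + cg)
        for f, g, cf, cg in zip(filter_pre, gate_pre, cond_filter, cond_gate)
    ]
```

With a wide-open gate the unit reduces to a plain tanh, and a class label or embedding only shifts the pre-activations, which is one simple way a single network could be steered by "any vector".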

Associative Long Short-Term Memory

May 19, 2016

Ivo Danihelka, Greg Wayne, Benigno Uria, Nal Kalchbrenner, Alex Graves

Abstract: We investigate a new method to augment recurrent neural networks with extra memory without increasing the number of network parameters. The system has an associative memory based on complex-valued vectors and is closely related to Holographic Reduced Representations and Long Short-Term Memory networks. Holographic Reduced Representations have limited capacity: as they store more information, each retrieval becomes noisier due to interference. Our system in contrast creates redundant copies of stored information, which enables retrieval with reduced noise. Experiments demonstrate faster learning on multiple memorization tasks.
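To make the idea of a complex-valued associative memory with redundant copies concrete, here is a minimal sketch. It assumes unit-modulus keys, binding by elementwise complex multiplication, retrieval by multiplying with the complex conjugate of the key, and one random key permutation per copy with the retrieved copies averaged. All function names are hypothetical, and this is an illustration of the general technique rather than the authors' implementation.

```python
import math
import random

def random_key(dim, rng):
    """Key with unit-modulus entries, so conj(k) * k == 1 elementwise."""
    return [complex(math.cos(a), math.sin(a))
            for a in (rng.uniform(0.0, 2.0 * math.pi) for _ in range(dim))]

def permute(vec, perm):
    return [vec[j] for j in perm]

def store(pairs):
    """Superimpose key * value for every (key, value) pair in one trace."""
    trace = [0j] * len(pairs[0][0])
    for key, value in pairs:
        for i, (k, v) in enumerate(zip(key, value)):
            trace[i] += k * v
    return trace

def retrieve(trace, key):
    """Unbind with the conjugate key; other stored items show up as noise."""
    return [k.conjugate() * t for k, t in zip(key, trace)]

def make_permutations(dim, copies, rng):
    perms = []
    for _ in range(copies):
        p = list(range(dim))
        rng.shuffle(p)
        perms.append(p)
    return perms

def store_redundant(pairs, perms):
    """One trace per copy; each copy binds with a differently permuted key."""
    return [store([(permute(k, p), v) for k, v in pairs]) for p in perms]

def retrieve_redundant(traces, key, perms):
    """Average the per-copy retrievals so the interference partly cancels."""
    total = [0j] * len(key)
    for trace, p in zip(traces, perms):
        for i, x in enumerate(retrieve(trace, permute(key, p))):
            total[i] += x
    return [x / len(traces) for x in total]
```

With a single stored pair, retrieval returns the value up to floating-point error. With several pairs, each copy returns the value plus interference from the other items, and because the permutations decorrelate that interference across copies, the average is expected to be less noisy than any single copy.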

* ICML-2016

Grid Long Short-Term Memory

Jan 07, 2016

Nal Kalchbrenner, Ivo Danihelka, Alex Graves

Abstract: This paper introduces Grid Long Short-Term Memory, a network of LSTM cells arranged in a multidimensional grid that can be applied to vectors, sequences or higher dimensional data such as images. The network differs from existing deep LSTM architectures in that the cells are connected between network layers as well as along the spatiotemporal dimensions of the data. The network provides a unified way of using LSTM for both deep and sequential computation. We apply the model to algorithmic tasks such as 15-digit integer addition and sequence memorization, where it is able to significantly outperform the standard LSTM. We then give results for two empirical tasks. We find that 2D Grid LSTM achieves 1.47 bits per character on the Wikipedia character prediction benchmark, which is state-of-the-art among neural approaches. In addition, we use the Grid LSTM to define a novel two-dimensional translation model, the Reencoder, and show that it outperforms a phrase-based reference system on a Chinese-to-English translation task.

* 15 pages

Resolving Lexical Ambiguity in Tensor Regression Models of Meaning

Aug 26, 2014

Dimitri Kartsaklis, Nal Kalchbrenner, Mehrnoosh Sadrzadeh

Abstract: This paper provides a method for improving tensor-based compositional distributional models of meaning by the addition of an explicit disambiguation step prior to composition. In contrast with previous research where this hypothesis has been successfully tested against relatively simple compositional models, in our work we use a robust model trained with linear regression. The results we get in two experiments show the superiority of the prior disambiguation method and suggest that the effectiveness of this approach is model-independent.

* Proceedings of ACL 2014, Vol. 2: Short Papers, pp. 212-217

Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network

Jun 15, 2014

Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas

Abstract: Capturing the compositional process which maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval. We introduce a model that is able to represent the meaning of documents by embedding them in a low dimensional vector space, while preserving distinctions of word and sentence order crucial for capturing nuanced semantics. Our model is based on an extended Dynamic Convolution Neural Network, which learns convolution filters at both the sentence and document level, hierarchically learning to capture and compose low level lexical features into high level semantic concepts. We demonstrate the effectiveness of this model on a range of document modelling tasks, achieving strong results with no feature engineering and with a more compact model. Inspired by recent advances in visualising deep convolution networks for computer vision, we present a novel visualisation technique for our document networks which not only provides insight into their learning process, but also can be interpreted to produce a compelling automatic summarisation system for texts.

A Convolutional Neural Network for Modelling Sentences

Apr 08, 2014

Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom

Abstract: The ability to accurately represent sentences is central to language understanding. We describe a convolutional architecture dubbed the Dynamic Convolutional Neural Network (DCNN) that we adopt for the semantic modelling of sentences. The network uses Dynamic k-Max Pooling, a global pooling operation over linear sequences. The network handles input sentences of varying length and induces a feature graph over the sentence that is capable of explicitly capturing short and long-range relations. The network does not rely on a parse tree and is easily applicable to any language. We test the DCNN in four experiments: small scale binary and multi-class sentiment prediction, six-way question classification and Twitter sentiment prediction by distant supervision. The network achieves excellent performance in the first three tasks and a greater than 25% error reduction in the last task with respect to the strongest baseline.
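A small sketch may help picture k-max pooling over a linear sequence: keep the k largest activations but preserve their original left-to-right order. The abstract does not say how k is chosen; the schedule in `dynamic_k` below, where k shrinks with depth and scales with sentence length down to a fixed floor `k_top`, is an assumption for illustration, and both function names are hypothetical.

```python
def k_max_pooling(values, k):
    """Return the k largest entries of `values` in their original order."""
    if k >= len(values):
        return list(values)
    # Rank positions by value (largest first); ties go to the earlier position.
    top = sorted(range(len(values)), key=lambda i: (-values[i], i))[:k]
    # Re-sort the chosen positions so the output keeps sequence order.
    return [values[i] for i in sorted(top)]

def dynamic_k(layer, total_layers, sentence_length, k_top):
    """Assumed schedule: max(k_top, ceil((L - l) / L * s)), in integer math."""
    remaining = (total_layers - layer) * sentence_length
    return max(k_top, (remaining + total_layers - 1) // total_layers)

row = [1, 5, 2, 8, 3]
print(k_max_pooling(row, 3))    # [5, 8, 3]
print(dynamic_k(1, 3, 18, 3))   # 12
```

Because k depends on the sentence length at the lower layers and is fixed at the top, sentences of different lengths end up with a representation of the same size, which is consistent with the abstract's claim that the network handles inputs of varying length.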

Recurrent Convolutional Neural Networks for Discourse Compositionality

Jun 15, 2013

Nal Kalchbrenner, Phil Blunsom

Abstract: The compositionality of meaning extends beyond the single sentence. Just as words combine to form the meaning of sentences, so do sentences combine to form the meaning of paragraphs, dialogues and general discourse. We introduce both a sentence model and a discourse model corresponding to the two levels of compositionality. The sentence model adopts convolution as the central operation for composing semantic vectors and is based on a novel hierarchical convolutional neural network. The discourse model extends the sentence model and is based on a recurrent neural network that is conditioned in a novel way both on the current sentence and on the current speaker. The discourse model is able to capture both the sequentiality of sentences and the interaction between different speakers. Without feature engineering or pretraining and with simple greedy decoding, the discourse model coupled to the sentence model obtains state of the art performance on a dialogue act classification experiment.
