John Blitzer

Integrated Triaging for Fast Reading Comprehension

Sep 28, 2019
Felix Wu, Boyi Li, Lequn Wang, Ni Lao, John Blitzer, Kilian Q. Weinberger

Although automatic machine reading comprehension (MRC) systems have recently reached super-human performance on several benchmarks, little attention has been paid to their computational efficiency. Efficiency, however, is crucial for training and deployment in real-world applications. This paper introduces Integrated Triaging, a framework that prunes almost all context in the early layers of a network, leaving the remaining (deep) layers to scan only a tiny fraction of the full corpus. This pruning drastically increases the efficiency of MRC models and further prevents the later layers from overfitting to the short paragraphs prevalent in the training set. Our framework is extremely flexible and naturally applicable to a wide variety of models. Experiments on the doc-SQuAD and TriviaQA tasks demonstrate its effectiveness in consistently improving both the speed and the quality of several diverse MRC models.
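
The core mechanism lends itself to a short sketch: cheap early layers score every context position, and only the top-scoring positions are passed on to the expensive deep layers. The PyTorch snippet below is a minimal illustration under assumed shapes; the class name `TriagedReader`, the layer choices, and all dimensions are hypothetical, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TriagedReader(nn.Module):
    """Two-stage reader: a cheap early encoder scores all context
    positions, and the expensive deep encoder sees only the survivors."""

    def __init__(self, hidden=128, keep=64):
        super().__init__()
        self.keep = keep
        self.early = nn.GRU(hidden, hidden, batch_first=True)  # cheap triage pass
        self.scorer = nn.Linear(hidden, 1)                     # per-position relevance
        self.deep = nn.GRU(hidden, hidden, num_layers=2, batch_first=True)

    def forward(self, context):  # context: (batch, seq_len, hidden)
        shallow, _ = self.early(context)
        scores = self.scorer(shallow).squeeze(-1)                # (batch, seq_len)
        k = min(self.keep, context.size(1))
        kept = scores.topk(k, dim=1).indices.sort(dim=1).values  # keep text order
        pruned = torch.gather(
            shallow, 1, kept.unsqueeze(-1).expand(-1, -1, shallow.size(-1)))
        deep_out, _ = self.deep(pruned)   # deep layers scan only k positions
        return deep_out, kept

out, kept = TriagedReader()(torch.randn(2, 400, 128))  # deep pass sees 64 of 400
```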

* Technical report 

FastFusionNet: New State-of-the-Art for DAWNBench SQuAD

Mar 02, 2019
Felix Wu, Boyi Li, Lequn Wang, Ni Lao, John Blitzer, Kilian Q. Weinberger

In this technical report, we introduce FastFusionNet, an efficient variant of FusionNet [12]. FusionNet is a high-performing reading comprehension architecture that was designed primarily for maximum retrieval accuracy, with little regard for computational requirements. For FastFusionNet, we remove the expensive CoVe layers [21] and substitute the BiLSTMs with far more efficient SRU layers [19]. The resulting architecture obtains state-of-the-art results on DAWNBench [5] while achieving the lowest training and inference times on SQuAD [25] to date. The code is available at https://github.com/felixgwu/FastFusionNet.
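
The substitution at the heart of the speedup can be sketched in a few lines, assuming the open-source `sru` package (`pip install sru`); the sizes below are placeholders, not FusionNet's real hyperparameters.

```python
import torch
import torch.nn as nn
from sru import SRU  # Lei et al.'s Simple Recurrent Unit

INPUT, HIDDEN, LAYERS = 300, 128, 2   # hypothetical sizes

# FusionNet-style encoder: a bidirectional LSTM, sequential over time.
bilstm = nn.LSTM(INPUT, HIDDEN, LAYERS, bidirectional=True)

# FastFusionNet-style replacement: SRU layers, whose heavy matrix
# multiplications are computed for all time steps in parallel.
sru = SRU(INPUT, HIDDEN, LAYERS, bidirectional=True)

x = torch.randn(50, 8, INPUT)            # (seq_len, batch, features)
out_lstm, _ = bilstm(x)
out_sru, _ = sru(x)
assert out_lstm.shape == out_sru.shape   # both (50, 8, 2 * HIDDEN)
```

Because the output shapes match, the swap leaves the rest of the fusion architecture untouched.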

* A Technical Report 

Fast Reading Comprehension with ConvNets

Nov 12, 2017
Felix Wu, Ni Lao, John Blitzer, Guandao Yang, Kilian Weinberger

State-of-the-art deep reading comprehension models are dominated by recurrent neural nets. Their sequential nature is a natural fit for language, but it also precludes parallelization within an instance and often becomes the bottleneck when deploying such models in latency-critical scenarios. This is particularly problematic for longer texts. Here we present a convolutional architecture as an alternative to these recurrent architectures. Using simple dilated convolutional units in place of recurrent ones, we achieve results comparable to the state of the art on two question answering tasks, while achieving speedups of up to two orders of magnitude for question answering.
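
The replacement of recurrent units with dilated convolutions can be illustrated as follows; this is a generic PyTorch sketch with hypothetical widths and dilation rates, not the paper's exact configuration. Stacking increasing dilations grows the receptive field exponentially while every position is computed in parallel.

```python
import torch
import torch.nn as nn

class DilatedConvEncoder(nn.Module):
    """Stack of dilated 1-D convolutions standing in for a recurrent
    encoder; all positions are processed in parallel."""

    def __init__(self, hidden=128, kernel=3, dilations=(1, 2, 4, 8)):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.Conv1d(hidden, hidden, kernel,
                      padding=d * (kernel - 1) // 2, dilation=d)
            for d in dilations)

    def forward(self, x):                # x: (batch, seq_len, hidden)
        h = x.transpose(1, 2)            # Conv1d expects (batch, channels, seq_len)
        for conv in self.layers:
            h = torch.relu(conv(h)) + h  # residual connection per layer
        return h.transpose(1, 2)

y = DilatedConvEncoder()(torch.randn(2, 100, 128))  # shape preserved: (2, 100, 128)
```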

* 15 pages, 10 figures, submitted to ICLR 2018 

Evaluating Induced CCG Parsers on Grounded Semantic Parsing

Jan 31, 2017
Yonatan Bisk, Siva Reddy, John Blitzer, Julia Hockenmaier, Mark Steedman

We compare the effectiveness of four different syntactic CCG parsers on a semantic slot-filling task to explore how much syntactic supervision is required for downstream semantic analysis. This extrinsic, task-based evaluation provides a unique window into the strengths and weaknesses of the semantics captured by unsupervised grammar induction systems. We release a new Freebase semantic parsing dataset called SPADES (Semantic PArsing of DEclarative Sentences) containing 93K cloze-style questions paired with answers. We evaluate all our models on this dataset. Our code and data are available at https://github.com/sivareddyg/graph-parser.
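
For readers unfamiliar with the cloze setup, a SPADES-style record pairs a declarative sentence, with one entity blanked out, against that entity as the answer. The record below is invented for illustration; the field names and the blank token are hypothetical, not necessarily those of the released data.

```python
# Invented example of the cloze format; not taken from the SPADES release.
example = {
    "sentence": "Barack Obama was born in _blank_ .",
    "answer": "Honolulu",
}
```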

* EMNLP 2016, Table 2 erratum, Code and Freebase Semantic Parsing data URL 

Latent Structured Ranking

Oct 16, 2012
Jason Weston, John Blitzer

Many latent (factorized) models have been proposed for recommendation tasks like collaborative filtering and for ranking tasks like document or image retrieval and annotation. Common to all these methods is that, during inference, items are scored independently by their similarity to the query in the latent embedding space. The structure of the ranked list (i.e., the set of items returned, considered as a whole) is not taken into account. This can be a problem because the set of top predictions can be either too diverse (containing results that contradict each other) or not diverse enough. In this paper we introduce a method for learning latent structured rankings that improves over existing methods by providing the right blend of predictions at the top of the ranked list. Particular emphasis is put on making this method scalable. Empirical results on large-scale image annotation and music recommendation tasks show improvements over existing approaches.
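
To make the contrast concrete, the sketch below first scores items independently in a latent space and then greedily re-ranks them with a redundancy penalty, so the returned list is judged as a whole. The re-ranking step is a generic MMR-style illustration of the structured idea, not the paper's learned model.

```python
import numpy as np

def independent_scores(query, items):
    """Standard latent-factor retrieval: each item is scored by its
    similarity to the query alone, ignoring the rest of the list."""
    return items @ query                                  # (n_items,)

def structured_rerank(query, items, k=5, trade_off=0.7):
    """Greedy selection that judges the list as a whole by penalizing
    items too similar to those already chosen (MMR-style sketch)."""
    base = independent_scores(query, items)
    chosen, candidates = [], list(range(len(items)))
    while candidates and len(chosen) < k:
        def gain(i):
            redundancy = max((items[i] @ items[j] for j in chosen), default=0.0)
            return trade_off * base[i] - (1.0 - trade_off) * redundancy
        best = max(candidates, key=gain)
        chosen.append(best)
        candidates.remove(best)
    return chosen

rng = np.random.default_rng(0)
query, items = rng.normal(size=8), rng.normal(size=(100, 8))
print(structured_rerank(query, items))   # indices of a diversity-aware top-5
```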

* Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012) 

Multi-View Learning over Structured and Non-Identical Outputs

Jun 13, 2012
Kuzman Ganchev, Joao Graca, John Blitzer, Ben Taskar

In many machine learning problems, labeled training data is limited but unlabeled data is ample. Some of these problems have instances that can be factored into multiple views, each of which is nearly sufficient for determining the correct labels. In this paper we present a new algorithm for probabilistic multi-view learning which uses the idea of stochastic agreement between views as regularization. Our algorithm works on structured and unstructured problems and easily generalizes to partial-agreement scenarios. For the full-agreement case, our algorithm minimizes the Bhattacharyya distance between the models of each view, and it performs better than CoBoosting and the two-view Perceptron on several flat and structured classification problems.
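
The full-agreement regularizer has a compact form: the Bhattacharyya distance between the two views' predicted label distributions is zero exactly when the views agree. The PyTorch snippet below is a minimal sketch of that penalty; the two-view setup and batch sizes are hypothetical.

```python
import torch

def bhattacharyya_distance(p, q, eps=1e-12):
    """-log of the Bhattacharyya coefficient between two distributions;
    zero when the views agree exactly, growing as they diverge."""
    bc = (p.clamp_min(eps) * q.clamp_min(eps)).sqrt().sum(dim=-1)
    return -torch.log(bc)

# Hypothetical two-view setup: each view predicts a distribution over
# the same labels for a batch of unlabeled instances.
p = torch.randn(4, 3).softmax(dim=-1)   # view 1 predictions
q = torch.randn(4, 3).softmax(dim=-1)   # view 2 predictions
penalty = bhattacharyya_distance(p, q).mean()
print(penalty)  # added to the supervised losses, weighted by a coefficient
```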

* Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008) 