Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Avirup Sil

Multilingual Transfer Learning for QA Using Translation as Data Augmentation

Dec 10, 2020

Mihaela Bornea, Lin Pan, Sara Rosenthal, Radu Florian, Avirup Sil

Figure 1 for Multilingual Transfer Learning for QA Using Translation as Data Augmentation

Figure 2 for Multilingual Transfer Learning for QA Using Translation as Data Augmentation

Figure 3 for Multilingual Transfer Learning for QA Using Translation as Data Augmentation

Figure 4 for Multilingual Transfer Learning for QA Using Translation as Data Augmentation

Abstract:Prior work on multilingual question answering has mostly focused on using large multilingual pre-trained language models (LM) to perform zero-shot language-wise learning: train a QA model on English and test on other languages. In this work, we explore strategies that improve cross-lingual transfer by bringing the multilingual embeddings closer in the semantic space. Our first strategy augments the original English training data with machine translation-generated data. This results in a corpus of multilingual silver-labeled QA pairs that is 14 times larger than the original training set. In addition, we propose two novel strategies, language adversarial training and language arbitration framework, which significantly improve the (zero-resource) cross-lingual transfer performance and result in LM embeddings that are less language-variant. Empirically, we show that the proposed models outperform the previous zero-shot baseline on the recently introduced multilingual MLQA and TyDiQA datasets.

* AAAI 2021

Via

Access Paper or Ask Questions

Answer Span Correction in Machine Reading Comprehension

Nov 06, 2020

Revanth Gangi Reddy, Md Arafat Sultan, Efsun Sarioglu Kayi, Rong Zhang, Vittorio Castelli, Avirup Sil

Figure 1 for Answer Span Correction in Machine Reading Comprehension

Figure 2 for Answer Span Correction in Machine Reading Comprehension

Figure 3 for Answer Span Correction in Machine Reading Comprehension

Figure 4 for Answer Span Correction in Machine Reading Comprehension

Abstract:Answer validation in machine reading comprehension (MRC) consists of verifying an extracted answer against an input context and question pair. Previous work has looked at re-assessing the "answerability" of the question given the extracted answer. Here we address a different problem: the tendency of existing MRC systems to produce partially correct answers when presented with answerable questions. We explore the nature of such errors and propose a post-processing correction method that yields statistically significant performance improvements over state-of-the-art MRC systems in both monolingual and multilingual evaluation.

* Accepted in Findings of EMNLP 2020

Via

Access Paper or Ask Questions

Multi-Stage Pre-training for Low-Resource Domain Adaptation

Oct 12, 2020

Rong Zhang, Revanth Gangi Reddy, Md Arafat Sultan, Vittorio Castelli, Anthony Ferritto, Radu Florian, Efsun Sarioglu Kayi, Salim Roukos, Avirup Sil, Todd Ward

Figure 1 for Multi-Stage Pre-training for Low-Resource Domain Adaptation

Figure 2 for Multi-Stage Pre-training for Low-Resource Domain Adaptation

Figure 3 for Multi-Stage Pre-training for Low-Resource Domain Adaptation

Figure 4 for Multi-Stage Pre-training for Low-Resource Domain Adaptation

Abstract:Transfer learning techniques are particularly useful in NLP tasks where a sizable amount of high-quality annotated data is difficult to obtain. Current approaches directly adapt a pre-trained language model (LM) on in-domain text before fine-tuning to downstream tasks. We show that extending the vocabulary of the LM with domain-specific terms leads to further gains. To a bigger effect, we utilize structure in the unlabeled data to create auxiliary synthetic tasks, which helps the LM transfer to downstream tasks. We apply these approaches incrementally on a pre-trained Roberta-large LM and show considerable performance gain on three tasks in the IT domain: Extractive Reading Comprehension, Document Ranking and Duplicate Question Detection.

* Accepted at EMNLP 2020

Via

Access Paper or Ask Questions

The TechQA Dataset

Nov 08, 2019

Vittorio Castelli, Rishav Chakravarti, Saswati Dana, Anthony Ferritto, Radu Florian, Martin Franz, Dinesh Garg, Dinesh Khandelwal, Scott McCarley, Mike McCawley(+11 more)

Abstract:We introduce TechQA, a domain-adaptation question answering dataset for the technical support domain. The TechQA corpus highlights two real-world issues from the automated customer support domain. First, it contains actual questions posed by users on a technical forum, rather than questions generated specifically for a competition or a task. Second, it has a real-world size -- 600 training, 310 dev, and 490 evaluation question/answer pairs -- thus reflecting the cost of creating large labeled datasets with actual data. Consequently, TechQA is meant to stimulate research in domain adaptation rather than being a resource to build QA systems from scratch. The dataset was obtained by crawling the IBM Developer and IBM DeveloperWorks forums for questions with accepted answers that appear in a published IBM Technote---a technical document that addresses a specific technical issue. We also release a collection of the 801,998 publicly available Technotes as of April 4, 2019 as a companion resource that might be used for pretraining, to learn representations of the IT domain language.

* Long version of conference paper to be submitted

Via

Access Paper or Ask Questions

Ensembling Strategies for Answering Natural Questions

Nov 06, 2019

Anthony Ferritto, Lin Pan, Rishav Chakravarti, Salim Roukos, Radu Florian, J. William Murdock, Avirup Sil

Figure 1 for Ensembling Strategies for Answering Natural Questions

Figure 2 for Ensembling Strategies for Answering Natural Questions

Figure 3 for Ensembling Strategies for Answering Natural Questions

Figure 4 for Ensembling Strategies for Answering Natural Questions

Abstract:Many of the top question answering systems today utilize ensembling to improve their performance on tasks such as the Stanford Question Answering Dataset (SQuAD) and Natural Questions (NQ) challenges. Unfortunately most of these systems do not publish their ensembling strategies used in their leaderboard submissions. In this work, we investigate a number of ensembling techniques and demonstrate a strategy which improves our F1 score for short answers on the dev set for NQ by 2.3 F1 points over our single model (which outperforms the previous SOTA by 1.9 F1 points).

* arXiv admin note: text overlap with arXiv:1909.05286

Via

Access Paper or Ask Questions

Frustratingly Easy Natural Question Answering

Sep 11, 2019

Lin Pan, Rishav Chakravarti, Anthony Ferritto, Michael Glass, Alfio Gliozzo, Salim Roukos, Radu Florian, Avirup Sil

Figure 1 for Frustratingly Easy Natural Question Answering

Figure 2 for Frustratingly Easy Natural Question Answering

Figure 3 for Frustratingly Easy Natural Question Answering

Figure 4 for Frustratingly Easy Natural Question Answering

Abstract:Existing literature on Question Answering (QA) mostly focuses on algorithmic novelty, data augmentation, or increasingly large pre-trained language models like XLNet and RoBERTa. Additionally, a lot of systems on the QA leaderboards do not have associated research documentation in order to successfully replicate their experiments. In this paper, we outline these algorithmic components such as Attention-over-Attention, coupled with data augmentation and ensembling strategies that have shown to yield state-of-the-art results on benchmark datasets like SQuAD, even achieving super-human performance. Contrary to these prior results, when we evaluate on the recently proposed Natural Questions benchmark dataset, we find that an incredibly simple approach of transfer learning from BERT outperforms the previous state-of-the-art system trained on 4 million more examples than ours by 1.9 F1 points. Adding ensembling strategies further improves that number by 2.3 F1 points.

Via

Access Paper or Ask Questions

Span Selection Pre-training for Question Answering

Sep 09, 2019

Michael Glass, Alfio Gliozzo, Rishav Chakravarti, Anthony Ferritto, Lin Pan, G P Shrivatsa Bhargav, Dinesh Garg, Avirup Sil

Figure 1 for Span Selection Pre-training for Question Answering

Figure 2 for Span Selection Pre-training for Question Answering

Figure 3 for Span Selection Pre-training for Question Answering

Figure 4 for Span Selection Pre-training for Question Answering

Abstract:BERT (Bidirectional Encoder Representations from Transformers) and related pre-trained Transformers have provided large gains across many language understanding tasks, achieving a new state-of-the-art (SOTA). BERT is pre-trained on two auxiliary tasks: Masked Language Model and Next Sentence Prediction. In this paper we introduce a new pre-training task inspired by reading comprehension and an effort to avoid encoding general knowledge in the transformer network itself. We find significant and consistent improvements over both BERT-BASE and BERT-LARGE on multiple reading comprehension (MRC) and paraphrasing datasets. Specifically, our proposed model has strong empirical evidence as it obtains SOTA results on Natural Questions, a new benchmark MRC dataset, outperforming BERT-LARGE by 3 F1 points on short answer prediction. We also establish a new SOTA in HotpotQA, improving answer prediction F1 by 4 F1 points and supporting fact prediction by 1 F1 point. Moreover, we show that our pre-training approach is particularly effective when training data is limited, improving the learning curve by a large amount.

Via

Access Paper or Ask Questions

CFO: A Framework for Building Production NLP Systems

Aug 30, 2019

Rishav Chakravarti, Cezar Pendus, Andrzej Sakrajda, Anthony Ferritto, Lin Pan, Michael Glass, Vittorio Castelli, J. William Murdock, Radu Florian, Salim Roukos(+1 more)

Figure 1 for CFO: A Framework for Building Production NLP Systems

Figure 2 for CFO: A Framework for Building Production NLP Systems

Figure 3 for CFO: A Framework for Building Production NLP Systems

Figure 4 for CFO: A Framework for Building Production NLP Systems

Abstract:This paper introduces a novel orchestration framework, called CFO (COMPUTATION FLOW ORCHESTRATOR), for building, experimenting with, and deploying interactive NLP (Natural Language Processing) and IR (Information Retrieval) systems to production environments. We then demonstrate a question answering system built using this framework which incorporates state-of-the-art BERT based MRC (Machine Reading Comprehension) with IR components to enable end-to-end answer retrieval. Results from the demo system are shown to be high quality in both academic and industry domain specific settings. Finally, we discuss best practices when (pre-)training BERT based MRC models for production systems.

* Will appear in EMNLP 2019

Via

Access Paper or Ask Questions

Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

Jun 26, 2018

Gourab Kundu, Avirup Sil, Radu Florian, Wael Hamza

Figure 1 for Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

Figure 2 for Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

Figure 3 for Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

Figure 4 for Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking

Abstract:We propose an entity-centric neural cross-lingual coreference model that builds on multi-lingual embeddings and language-independent features. We perform both intrinsic and extrinsic evaluations of our model. In the intrinsic evaluation, we show that our model, when trained on English and tested on Chinese and Spanish, achieves competitive results to the models trained directly on Chinese and Spanish respectively. In the extrinsic evaluation, we show that our English model helps achieve superior entity linking accuracy on Chinese and Spanish test sets than the top 2015 TAC system without using any annotated data from Chinese or Spanish.

* ACL 2018

Via

Access Paper or Ask Questions

Neural Cross-Lingual Entity Linking

Dec 05, 2017

Avirup Sil, Gourab Kundu, Radu Florian, Wael Hamza

Figure 1 for Neural Cross-Lingual Entity Linking

Figure 2 for Neural Cross-Lingual Entity Linking

Figure 3 for Neural Cross-Lingual Entity Linking

Figure 4 for Neural Cross-Lingual Entity Linking

Abstract:A major challenge in Entity Linking (EL) is making effective use of contextual information to disambiguate mentions to Wikipedia that might refer to different entities in different contexts. The problem exacerbates with cross-lingual EL which involves linking mentions written in non-English documents to entries in the English Wikipedia: to compare textual clues across languages we need to compute similarity between textual fragments across languages. In this paper, we propose a neural EL model that trains fine-grained similarities and dissimilarities between the query and candidate document from multiple perspectives, combined with convolution and tensor networks. Further, we show that this English-trained system can be applied, in zero-shot learning, to other languages by making surprisingly effective use of multi-lingual embeddings. The proposed system has strong empirical evidence yielding state-of-the-art results in English as well as cross-lingual: Spanish and Chinese TAC 2015 datasets.

* Association for the Advancement of Artificial Intelligence (AAAI), 2018

Via

Access Paper or Ask Questions