Yaser Al-Onaizan

Label Semantics for Few Shot Named Entity Recognition

Mar 16, 2022
Jie Ma, Miguel Ballesteros, Srikanth Doss, Rishita Anubhai, Sunil Mallya, Yaser Al-Onaizan, Dan Roth

We study the problem of few shot learning for named entity recognition. Specifically, we leverage the semantic information in the names of the labels as a way of giving the model additional signal and enriched priors. We propose a neural architecture that consists of two BERT encoders, one to encode the document and its tokens and another one to encode each of the labels in natural language format. Our model learns to match the representations of named entities computed by the first encoder with label representations computed by the second encoder. The label semantics signal is shown to support improved state-of-the-art results in multiple few shot NER benchmarks and on-par performance in standard benchmarks. Our model is especially effective in low resource settings.
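
The matching step described in this abstract can be illustrated with a minimal sketch. It assumes the token and label representations are plain vectors compared by dot product followed by a softmax; the abstract does not state the scoring function, so that choice, the helper names, the label names, and the toy 3-dimensional vectors (stand-ins for the outputs of the two BERT encoders) are all hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def match_tokens_to_labels(token_vecs, label_vecs, label_names):
    """Assign each token the label whose vector scores highest against it."""
    predictions = []
    for tok in token_vecs:
        probs = softmax([dot(tok, lab) for lab in label_vecs])
        best = max(range(len(probs)), key=probs.__getitem__)
        predictions.append((label_names[best], probs[best]))
    return predictions


# Hypothetical stand-ins: in the paper these would come from two encoders,
# one for document tokens and one for natural-language label names.
label_names = ["person", "location", "other"]
label_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
token_vecs = [[2.0, 0.1, 0.3], [0.2, 1.5, 0.1], [0.1, 0.2, 0.9]]

predicted = [name for name, _ in match_tokens_to_labels(token_vecs, label_vecs, label_names)]
print(predicted)
```

Because labels enter the model only through their vectors, adding a new label in this sketch means adding one more row to `label_vecs`, which is one plausible reading of why label names help in low-resource settings.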

* Findings of ACL 2022 

Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction

Jun 17, 2021
Elsbeth Turcan, Shuai Wang, Rishita Anubhai, Kasturi Bhattacharjee, Yaser Al-Onaizan, Smaranda Muresan

Detecting what emotions are expressed in text is a well-studied problem in natural language processing. However, research on finer grained emotion analysis such as what causes an emotion is still in its infancy. We present solutions that tackle both emotion recognition and emotion cause detection in a joint fashion. Considering that common-sense knowledge plays an important role in understanding implicitly expressed emotions and the reasons for those emotions, we propose novel methods that combine common-sense knowledge via adapted knowledge models with multi-task learning to perform joint emotion classification and emotion cause tagging. We show performance improvement on both tasks when including common-sense reasoning and a multitask framework. We provide a thorough analysis to gain insights into model performance.

* 15 pages, 6 figures. Findings of ACL 2021 

To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

Oct 27, 2020
Kasturi Bhattacharjee, Miguel Ballesteros, Rishita Anubhai, Smaranda Muresan, Jie Ma, Faisal Ladhak, Yaser Al-Onaizan

Leveraging large amounts of unlabeled data with Transformer-like architectures such as BERT has gained popularity in recent times, owing to their effectiveness in learning general representations that can then be fine-tuned for downstream tasks with much success. However, training these models can be costly from both an economic and an environmental standpoint. In this work, we investigate how to use unlabeled data effectively by exploring a task-specific semi-supervised approach, Cross-View Training (CVT), and comparing it with task-agnostic BERT in multiple settings that include domain- and task-relevant English data. CVT uses a much lighter model architecture, and we show that it achieves performance similar to BERT on a set of sequence tagging tasks with lower financial and environmental impact.

* Accepted in the Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020) (https://2020.emnlp.org/papers/main)

Resource-Enhanced Neural Model for Event Argument Extraction

Oct 06, 2020
Jie Ma, Shuai Wang, Rishita Anubhai, Miguel Ballesteros, Yaser Al-Onaizan

Event argument extraction (EAE) aims to identify the arguments of an event and classify the roles that those arguments play. Despite great efforts made in prior work, there remain many challenges: (1) Data scarcity. (2) Capturing the long-range dependency, specifically, the connection between an event trigger and a distant event argument. (3) Integrating event trigger information into candidate argument representation. For (1), we explore using unlabeled data in different ways. For (2), we propose to use a syntax-attending Transformer that can utilize dependency parses to guide the attention mechanism. For (3), we propose a trigger-aware sequence encoder with several types of trigger-dependent sequence representations. We also support argument extraction either from text annotated with gold entities or from plain text. Experiments on the English ACE2005 benchmark show that our approach achieves a new state-of-the-art.

* Findings of EMNLP 2020 

Exploring Content Selection in Summarization of Novel Chapters

May 07, 2020
Faisal Ladhak, Bryan Li, Yaser Al-Onaizan, Kathleen McKeown

We present a new summarization task, generating summaries of novel chapters using summary/chapter pairs from online study guides. This is a harder task than the news summarization task, given the chapter length as well as the extreme paraphrasing and generalization found in the summaries. We focus on extractive summarization, which requires the creation of a gold-standard set of extractive summaries. We present a new metric for aligning reference summary sentences with chapter sentences to create gold extracts and also experiment with different alignment methods. Our experiments demonstrate significant improvement over prior alignment approaches for our task as shown through automatic metrics and a crowd-sourced pyramid analysis.

* Accepted to ACL 2020 

Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions

May 04, 2020
Arjun R Akula, Spandana Gella, Yaser Al-Onaizan, Song-Chun Zhu, Siva Reddy

Visual referring expression recognition is a challenging task that requires natural language understanding in the context of an image. We critically examine RefCOCOg, a standard benchmark for this task, using a human study and show that 83.7% of test instances do not require reasoning on linguistic structure, i.e., words are enough to identify the target object and the word order doesn't matter. To measure the true progress of existing models, we split the test set into two sets, one which requires reasoning on linguistic structure and the other which doesn't. Additionally, we create an out-of-distribution dataset, Ref-Adv, by asking crowdworkers to perturb in-domain examples such that the target object changes. Using these datasets, we empirically show that existing methods fail to exploit linguistic structure and are 12% to 23% lower in performance than the established progress for this task. We also propose two methods, one based on contrastive learning and the other based on multi-task learning, to increase the robustness of ViLBERT, the current state-of-the-art model for this task. Our datasets are publicly available at https://github.com/aws/aws-refcocog-adv
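
The claim that words alone can be enough to identify the target object can be made concrete with a small sketch. It is not the paper's human study or any of its models: the bag-of-words predictor, the object attribute sets, and the example expression below are hypothetical, chosen only to show a predictor whose answer cannot change when the word order is shuffled.

```python
import random


def shuffle_words(expression, seed=0):
    """Return the expression with its words in a random (seeded) order."""
    words = expression.split()
    rng = random.Random(seed)
    rng.shuffle(words)
    return " ".join(words)


def bag_of_words_predict(expression, objects):
    """Pick the object whose attribute set overlaps most with the words used.

    The expression is treated as an unordered set of words, so this toy
    predictor ignores linguistic structure by construction.
    """
    words = set(expression.lower().split())
    return max(objects, key=lambda name: len(words & objects[name]))


# Hypothetical scene: two candidate objects described by attribute words.
objects = {"obj_a": {"red", "cup"}, "obj_b": {"blue", "plate"}}
expr = "the red cup on the table"

original = bag_of_words_predict(expr, objects)
shuffled = bag_of_words_predict(shuffle_words(expr), objects)
print(original, shuffled)
```

An instance on which such an order-blind predictor already succeeds does not test reasoning over linguistic structure, which is the motivation for splitting the test set and for perturbed examples in which the target object changes.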

* ACL 2020 

Evaluating Robustness to Input Perturbations for Neural Machine Translation

May 01, 2020
Xing Niu, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan

Neural Machine Translation (NMT) models are sensitive to small perturbations in the input. Robustness to such perturbations is typically measured using translation quality metrics such as BLEU on the noisy input. This paper proposes additional metrics which measure the relative degradation and changes in translation when small perturbations are added to the input. We focus on a class of models employing subword regularization to address robustness and perform extensive evaluations of these models using the robustness measures proposed. Results show that our proposed metrics reveal a clear trend of improved robustness to perturbations when subword regularization methods are used.
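
As a rough illustration of the two kinds of measurement mentioned here, relative degradation and change in translation, the sketch below uses stand-in definitions. The abstract does not give the paper's formulas, so the ratio of noisy to clean quality score and the token-level change measure built on `difflib.SequenceMatcher` are assumptions, as are the function names and example values.

```python
from difflib import SequenceMatcher


def robustness_ratio(score_clean, score_noisy):
    """Relative degradation: quality on noisy input divided by quality on clean input.

    Scores are assumed to come from any quality metric (BLEU, for example).
    A value of 1.0 means no degradation; lower values mean larger drops.
    """
    if score_clean <= 0:
        raise ValueError("clean score must be positive")
    return score_noisy / score_clean


def translation_change(output_clean, output_noisy):
    """Fraction of change between two translations, measured on tokens.

    Returns 0.0 for identical outputs and 1.0 when no tokens match.
    """
    matcher = SequenceMatcher(None, output_clean.split(), output_noisy.split())
    return 1.0 - matcher.ratio()


# Hypothetical numbers and sentences, for illustration only.
ratio = robustness_ratio(score_clean=40.0, score_noisy=30.0)
unchanged = translation_change("the cat sat down", "the cat sat down")
changed = translation_change("the cat sat down", "a dog stood up")
print(ratio, unchanged, changed)
```

The second measure needs no reference translation, so it can flag unstable outputs even when the quality score on the noisy input happens to stay the same.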

* Accepted at ACL 2020 

Joint translation and unit conversion for end-to-end localization

Apr 10, 2020
Georgiana Dinu, Prashant Mathur, Marcello Federico, Stanislas Lauly, Yaser Al-Onaizan

A variety of natural language tasks require processing of textual data which contains a mix of natural language and formal languages such as mathematical expressions. In this paper, we take unit conversions as an example and propose a data augmentation technique which leads to models learning both translation and conversion tasks as well as how to adequately switch between them for end-to-end localization.
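
One way to picture such a data augmentation is to generate synthetic sentence pairs in which the target side already contains the converted quantity. The sketch below is an assumption about the general shape of the idea, not the paper's procedure: the templates, the English-to-German direction, the miles-to-kilometres conversion, and the rounding to one decimal are all hypothetical, and locale-specific number formatting is ignored.

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile


def miles_to_km(miles):
    """Convert miles to kilometres, rounded to one decimal place."""
    return round(miles * KM_PER_MILE, 1)


def make_augmented_pairs(src_template, tgt_template, values):
    """Build (source, target) pairs where the target carries the converted value."""
    pairs = []
    for miles in values:
        source = src_template.format(value=miles)
        target = tgt_template.format(value=miles_to_km(miles))
        pairs.append((source, target))
    return pairs


# Hypothetical templates for illustration only.
pairs = make_augmented_pairs(
    "The trail is {value} miles long.",
    "Der Weg ist {value} km lang.",
    [10, 25],
)
for source, target in pairs:
    print(source, "->", target)
```

Pairs like these would be mixed with ordinary parallel data, so that a single model sees both plain translation and translation with conversion during training.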

Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events

Apr 08, 2020
Miguel Ballesteros, Rishita Anubhai, Shuai Wang, Nima Pourdamghani, Yogarshi Vyas, Jie Ma, Parminder Bhatia, Kathleen McKeown, Yaser Al-Onaizan

In this paper, we propose a neural architecture and a set of training methods for ordering events by predicting temporal relations. Our proposed models receive a pair of events within a span of text as input and identify the temporal relation (Before, After, Equal, Vague) between them. Given that a key challenge with this task is the scarcity of annotated data, our models rely on pretrained representations (i.e., RoBERTa, BERT or ELMo), transfer and multi-task learning (by leveraging complementary datasets), and self-training techniques. Experiments on the MATRES dataset of English documents establish a new state-of-the-art on this task.