Chenguang Zhu

APOLLO: A Simple Approach for Adaptive Pretraining of Language Models for Logical Reasoning

Dec 19, 2022
Soumya Sanyal, Yichong Xu, Shuohang Wang, Ziyi Yang, Reid Pryzant, Wenhao Yu, Chenguang Zhu, Xiang Ren

Logical reasoning over text is an important ability that requires understanding the information present in the text and its interconnections, and then reasoning through them to infer new conclusions. Prior works on improving the logical reasoning ability of language models require complex processing of training data (e.g., aligning symbolic knowledge to text), yielding task-specific data augmentation solutions that restrict the learning of general logical reasoning skills. In this work, we propose APOLLO, an adaptively pretrained language model that has improved logical reasoning abilities. We select a subset of Wikipedia, based on a set of logical inference keywords, for continued pretraining of a language model. We use two self-supervised loss functions: a modified masked language modeling loss in which only words with specific parts of speech, which would likely require more reasoning than basic language understanding, are masked, and a sentence-level classification loss that teaches the model to distinguish between entailment and contradiction types of sentences. The proposed training paradigm is both simple and independent of task formats. We demonstrate the effectiveness of APOLLO by comparing it with prior baselines on two logical reasoning datasets. APOLLO performs comparably on ReClor and outperforms baselines on LogiQA.

* 11 pages, 5 figures
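
The keyword-based selection step described in the APOLLO abstract can be illustrated with a minimal sketch. The keyword list and the function names below are hypothetical; the abstract does not give the actual keyword set, and the part-of-speech masking and classification losses are not shown here.

```python
import re

# Hypothetical keyword list: the actual set of logical inference
# keywords used for APOLLO is not given in the abstract.
LOGICAL_KEYWORDS = {"therefore", "because", "hence", "thus", "consequently"}


def has_logical_keyword(sentence: str) -> bool:
    """Return True if the sentence contains at least one keyword."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(token in LOGICAL_KEYWORDS for token in tokens)


def select_sentences(sentences):
    """Keep only the sentences that contain a logical keyword."""
    return [s for s in sentences if has_logical_keyword(s)]


corpus = [
    "The bridge was closed because the river flooded.",
    "Paris is the capital of France.",
    "The sample was contaminated; therefore, the test was repeated.",
]
selected = select_sentences(corpus)
```

Under these assumptions, `selected` keeps the first and third sentences, which would then form the corpus for continued pretraining.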

UniSumm: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning

Dec 06, 2022
Yulong Chen, Yang Liu, Ruochen Xu, Ziyi Yang, Chenguang Zhu, Michael Zeng, Yue Zhang

The diverse demands of different summarization tasks and their high annotation costs are driving a need for few-shot summarization. However, despite the emergence of many summarization tasks and datasets, the current training paradigm for few-shot summarization systems ignores potentially shareable knowledge in heterogeneous datasets. To this end, we propose UniSumm, a unified few-shot summarization model that is pre-trained with multiple summarization tasks and can be prefix-tuned to excel at any few-shot summarization dataset. Meanwhile, to better evaluate few-shot summarization systems, under the principles of diversity and robustness, we assemble and publicize a new benchmark, SummZoo. It consists of 8 diverse summarization tasks with multiple sets of few-shot samples for each task, covering both monologue and dialogue domains. Experimental results and ablation studies show that UniSumm outperforms strong baseline systems by a large margin across all tasks in SummZoo under both automatic and human evaluations. We release our code and benchmark at https://github.com/microsoft/UniSumm.

Unifying Vision, Text, and Layout for Universal Document Processing

Dec 05, 2022
Zineng Tang, Ziyi Yang, Guoxin Wang, Yuwei Fang, Yang Liu, Chenguang Zhu, Michael Zeng, Cha Zhang, Mohit Bansal

We propose Universal Document Processing (UDOP), a foundation Document AI model which unifies text, image, and layout modalities together with varied task formats, including document understanding and generation. UDOP leverages the spatial correlation between textual content and document image to model image, text, and layout modalities with one uniform representation. With a novel Vision-Text-Layout Transformer, UDOP unifies pretraining and multi-domain downstream tasks into a prompt-based sequence generation scheme. UDOP is pretrained on both large-scale unlabeled document corpora using innovative self-supervised objectives and diverse labeled data. UDOP also learns to generate document images from text and layout modalities via masked image reconstruction. To the best of our knowledge, this is the first time in the field of document AI that one model simultaneously achieves high-quality neural document editing and content customization. Our method sets the state-of-the-art on 9 Document AI tasks, e.g., document understanding and QA, across diverse data domains like finance reports, academic papers, and websites. UDOP ranks first on the leaderboard of the Document Understanding Benchmark (DUE).

Improving Commonsense in Vision-Language Models via Knowledge Graph Riddles

Nov 29, 2022
Shuquan Ye, Yujia Xie, Dongdong Chen, Yichong Xu, Lu Yuan, Chenguang Zhu, Jing Liao

This paper focuses on analyzing and improving the commonsense ability of recent popular vision-language (VL) models. Although these models have achieved great success, we observe that they still lack commonsense knowledge/reasoning ability (e.g., "Lemons are sour"), which is a vital component towards artificial general intelligence. Through our analysis, we find that one important reason is that existing large-scale VL datasets do not contain much commonsense knowledge, which motivates us to improve the commonsense of VL-models from the data perspective. Rather than collecting a new VL training dataset, we propose a more scalable strategy, i.e., "Data Augmentation with kNowledge graph linearization for CommonsensE capability" (DANCE). It can be viewed as one type of data augmentation technique, which can inject commonsense knowledge into existing VL datasets on the fly during training. More specifically, we leverage a commonsense knowledge graph (e.g., ConceptNet) and create variants of text descriptions in VL datasets via bidirectional sub-graph sequentialization. For better commonsense evaluation, we further propose the first retrieval-based commonsense diagnostic benchmark. By conducting extensive experiments on some representative VL-models, we demonstrate that our DANCE technique is able to significantly improve the commonsense ability while maintaining the performance on vanilla retrieval tasks. The code and data are available at https://github.com/pleaseconnectwifi/DANCE

* Code: https://github.com/pleaseconnectwifi/DANCE Project page: shuquanye.com/DANCE_website
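
To make the idea of knowledge graph linearization concrete, here is a small sketch that turns a single triple into text in both directions. It is not the DANCE implementation: the templates, the `linearize` function, and the example triple are assumptions for illustration only, with relation names written in ConceptNet style.

```python
# Hypothetical templates: one forward and one backward phrasing per
# relation. These are illustrative and not taken from the DANCE code.
TEMPLATES = {
    "HasProperty": ("{head} is {tail}", "{tail} is a property of {head}"),
    "UsedFor": ("{head} is used for {tail}", "{tail} is a use of {head}"),
}


def linearize(triple, reverse=False):
    """Turn a (head, relation, tail) triple into a short text span."""
    head, relation, tail = triple
    forward, backward = TEMPLATES[relation]
    template = backward if reverse else forward
    return template.format(head=head, tail=tail)


triple = ("lemon", "HasProperty", "sour")
forward_text = linearize(triple)
backward_text = linearize(triple, reverse=True)
```

Text spans like these could then be combined with an existing caption to form an augmented training description; how DANCE selects sub-graphs and merges them with captions is not specified in the abstract.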

Empowering Language Models with Knowledge Graph Reasoning for Question Answering

Nov 15, 2022
Ziniu Hu, Yichong Xu, Wenhao Yu, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Kai-Wei Chang, Yizhou Sun

Answering open-domain questions requires world knowledge about in-context entities. As pre-trained Language Models (LMs) lack the power to store all required knowledge, external knowledge sources, such as knowledge graphs, are often used to augment LMs. In this work, we propose knOwledge REasOning empowered Language Model (OREO-LM), which consists of a novel Knowledge Interaction Layer that can be flexibly plugged into existing Transformer-based LMs to interact with a differentiable Knowledge Graph Reasoning module collaboratively. In this way, the LM guides the KG to walk towards the desired answer, while the retrieved knowledge improves the LM. By applying OREO-LM to RoBERTa and T5, we show significant performance gains, achieving state-of-the-art results in the Closed-Book setting. The performance enhancement comes mainly from the KG reasoning's capacity to infer missing relational facts. In addition, OREO-LM provides reasoning paths as rationales to interpret the model's decision.

* Published at EMNLP 2022

MACSum: Controllable Summarization with Mixed Attributes

Nov 09, 2022
Yusen Zhang, Yang Liu, Ziyi Yang, Yuwei Fang, Yulong Chen, Dragomir Radev, Chenguang Zhu, Michael Zeng, Rui Zhang

Controllable summarization allows users to generate customized summaries with specified attributes. However, due to the lack of designated annotations of controlled summaries, existing works have to craft pseudo datasets by adapting generic summarization benchmarks. Furthermore, most research focuses on controlling single attributes individually (e.g., a short summary or a highly abstractive summary) rather than controlling a mix of attributes together (e.g., a short and highly abstractive summary). In this paper, we propose MACSum, the first human-annotated summarization dataset for controlling mixed attributes. It contains source texts from two domains, news articles and dialogues, with human-annotated summaries controlled by five designed attributes (Length, Extractiveness, Specificity, Topic, and Speaker). We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization based on hard prompt tuning and soft prefix tuning. Results and analysis demonstrate that hard prompt models yield the best performance on all metrics and human evaluations. However, mixed-attribute control is still challenging for summarization tasks. Our dataset and code are available at https://github.com/psunlpgroup/MACSum.

* 14 pages, 7 figures
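
As a rough illustration of hard prompting for mixed-attribute control, the sketch below prepends the requested attribute values to the source text. The five attribute names come from the MACSum abstract; the prompt layout, the separators, and the `build_hard_prompt` function are assumptions, not the format used in the MACSum code.

```python
# Attribute names are taken from the abstract; everything else is a
# hypothetical prompt layout chosen for illustration.
ATTRIBUTES = ("Length", "Extractiveness", "Specificity", "Topic", "Speaker")


def build_hard_prompt(source, controls):
    """Prepend the requested control attributes to the source text."""
    unknown = set(controls) - set(ATTRIBUTES)
    if unknown:
        raise ValueError(f"Unknown attributes: {sorted(unknown)}")
    # Emit attributes in a fixed order so prompts are deterministic.
    parts = [f"{name}: {controls[name]}" for name in ATTRIBUTES if name in controls]
    return " ; ".join(parts) + " || " + source


prompt = build_hard_prompt("A: hi", {"Speaker": "A", "Length": "short"})
```

Any subset of the five attributes can be passed, which is what makes the control "mixed"; the resulting string would be fed to a sequence-to-sequence summarizer in place of the raw source.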

Retrieval Augmentation for Commonsense Reasoning: A Unified Approach

Oct 23, 2022
Wenhao Yu, Chenguang Zhu, Zhihan Zhang, Shuohang Wang, Zhuosheng Zhang, Yuwei Fang, Meng Jiang

A common thread of retrieval-augmented methods in the existing literature focuses on retrieving encyclopedic knowledge, such as Wikipedia, which facilitates well-defined entity and relation spaces that can be modeled. However, applying such methods to commonsense reasoning tasks faces two unique challenges, i.e., the lack of a general large-scale corpus for retrieval and a corresponding effective commonsense retriever. In this paper, we systematically investigate how to leverage commonsense knowledge retrieval to improve commonsense reasoning tasks. We propose a unified framework of retrieval-augmented commonsense reasoning (called RACo), including a newly constructed commonsense corpus with over 20 million documents and novel strategies for training a commonsense retriever. We conduct experiments on four different commonsense reasoning tasks. Extensive evaluation results show that our proposed RACo can significantly outperform other knowledge-enhanced counterparts, achieving new SoTA performance on the CommonGen and CREAK leaderboards.

* EMNLP 2022 (main)

Tail Batch Sampling: Approximating Global Contrastive Losses as Optimization over Batch Assignments

Oct 23, 2022
Vin Sachidananda, Ziyi Yang, Chenguang Zhu

Contrastive learning has recently achieved state-of-the-art performance in a wide range of tasks. Many contrastive learning approaches use mined hard negatives to make batches more informative during training, but these approaches are inefficient: they increase epoch length in proportion to the number of mined negatives and require frequent updates of nearest neighbor indices or mining from recent batches. In this work, we provide an alternative to hard negative mining in supervised contrastive learning, Tail Batch Sampling (TBS), an efficient approximation to the batch assignment problem that upper bounds the gap between the global and training losses, $\mathcal{L}^{Global} - \mathcal{L}^{Train}$. TBS improves state-of-the-art performance in sentence embedding (+0.37 Spearman) and code-search tasks (+2.2% MRR). It is easy to implement, requiring only a few additional lines of code; it does not maintain external data structures such as nearest neighbor indices; it is more computationally efficient than even the most minimal hard negative mining approaches; and it makes no changes to the model being trained.

* 18 pages, 5 figures

Towards a Unified Multi-Dimensional Evaluator for Text Generation

Oct 13, 2022
Ming Zhong, Yang Liu, Da Yin, Yuning Mao, Yizhu Jiao, Pengfei Liu, Chenguang Zhu, Heng Ji, Jiawei Han

Multi-dimensional evaluation is the dominant paradigm for human evaluation in Natural Language Generation (NLG), i.e., evaluating the generated text from multiple explainable dimensions, such as coherence and fluency. However, automatic evaluation in NLG is still dominated by similarity-based metrics, and we lack a reliable framework for a more comprehensive evaluation of advanced models. In this paper, we propose a unified multi-dimensional evaluator UniEval for NLG. We re-frame NLG evaluation as a Boolean Question Answering (QA) task, and by guiding the model with different questions, we can use one evaluator to evaluate from multiple dimensions. Furthermore, thanks to the unified Boolean QA format, we are able to introduce an intermediate learning phase that enables UniEval to incorporate external knowledge from multiple related tasks and gain further improvement. Experiments on three typical NLG tasks show that UniEval correlates substantially better with human judgments than existing metrics. Specifically, compared to the top-performing unified evaluators, UniEval achieves a 23% higher correlation on text summarization, and over 43% on dialogue response generation. Also, UniEval demonstrates a strong zero-shot learning ability for unseen evaluation dimensions and tasks. Source code, data and all pre-trained evaluators are available on our GitHub repository (https://github.com/maszhongming/UniEval).

* EMNLP 2022
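
The Boolean QA framing in the UniEval abstract can be sketched as follows: each evaluation dimension becomes a yes/no question, and the evaluator's preference for "yes" over "no" is read as a score. The question template, both function names, and the two-way softmax scoring rule are assumptions for illustration; the abstract does not spell out the exact input format or scoring formula.

```python
import math


def build_question(dimension, output, source):
    """Build a yes/no question for one evaluation dimension.

    The template is hypothetical and only mirrors the Boolean QA idea.
    """
    return (
        f"question: Is this a {dimension} summary of the document? "
        f"summary: {output} document: {source}"
    )


def boolean_qa_score(logit_yes, logit_no):
    """Normalize the evaluator's 'yes' and 'no' logits into a score in [0, 1]."""
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(logit_yes, logit_no)
    exp_yes = math.exp(logit_yes - top)
    exp_no = math.exp(logit_no - top)
    return exp_yes / (exp_yes + exp_no)


question = build_question("coherent", "The river flooded.", "Heavy rain fell all week.")
score = boolean_qa_score(logit_yes=2.0, logit_no=0.0)
```

Changing only the `dimension` argument (for example, "fluent" or "consistent") reuses the same evaluator for another dimension, which is the point of the unified format. In practice the two logits would come from a pre-trained evaluator model, which is not included in this sketch.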
