Muhao Chen

University of California Davis

Dense Retrieval as Indirect Supervision for Large-space Decision Making

Oct 28, 2023

Nan Xu, Fei Wang, Mingtao Dong, Muhao Chen

Abstract: Many discriminative natural language understanding (NLU) tasks have large label spaces. Learning such a process of large-space decision making is particularly challenging due to the lack of training instances per label and the difficulty of selection among many fine-grained labels. Inspired by dense retrieval methods for passage finding in open-domain QA, we propose a reformulation of large-space discriminative NLU tasks as a learning-to-retrieve task, leading to a novel solution named Dense Decision Retrieval (DDR). Instead of predicting fine-grained decisions as logits, DDR adopts a dual-encoder architecture that learns to predict by retrieving from a decision thesaurus. This approach not only leverages rich indirect supervision signals from easy-to-consume learning resources for dense retrieval, but also leads to enhanced prediction generalizability with a semantically meaningful representation of the large decision space. When evaluated on tasks with decision spaces ranging from hundreds to hundred-thousand scales, DDR outperforms strong baselines by 27.54% in P@1 on two extreme multi-label classification tasks, 1.17% in F1 score on ultra-fine entity typing, and 1.26% in accuracy on three few-shot intent classification tasks on average. Code and resources are available at https://github.com/luka-group/DDR.

* EMNLP 2023 (Findings)
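
The idea of predicting by retrieving from a decision thesaurus can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `encode` function is a toy stand-in for the learned dual encoder, and the thesaurus entries and label names are hypothetical. It only shows the shape of the approach, where the input and each label description are embedded in the same space and the nearest label is returned.

```python
import math
from collections import Counter

def encode(text):
    """Toy encoder (assumption): L2-normalized bag-of-words vector.
    A real system would use a trained neural encoder here."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()}

def score(query_vec, label_vec):
    """Dot product of two sparse vectors; equals cosine similarity
    because both vectors are normalized."""
    return sum(w * label_vec.get(tok, 0.0) for tok, w in query_vec.items())

def retrieve(query, thesaurus, k=1):
    """Rank labels by similarity between the query and each label description."""
    q = encode(query)
    ranked = sorted(
        thesaurus,
        key=lambda label: score(q, encode(thesaurus[label])),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical decision thesaurus: label -> short textual description.
thesaurus = {
    "book_flight": "book a flight ticket to travel by plane",
    "check_balance": "check the balance of my bank account",
    "play_music": "play a song or music track",
}

print(retrieve("i want to book a plane ticket", thesaurus))
```

Because labels are represented by text rather than by output indices, adding a new label in this sketch only requires adding a thesaurus entry, not changing the scoring code.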

Affective and Dynamic Beam Search for Story Generation

Oct 23, 2023

Tenghao Huang, Ehsan Qasemi, Bangzheng Li, He Wang, Faeze Brahman, Muhao Chen, Snigdha Chaturvedi

Abstract: Storytelling's captivating potential makes it a fascinating research area, with implications for entertainment, education, therapy, and cognitive studies. In this paper, we propose Affective Story Generator (AffGen) for generating interesting narratives. AffGen introduces "intriguing twists" in narratives by employing two novel techniques: Dynamic Beam Sizing and Affective Reranking. Dynamic Beam Sizing encourages less predictable, more captivating word choices using a contextual multi-armed bandit model. Affective Reranking prioritizes sentence candidates based on affect intensity. Our empirical evaluations, both automatic and human, demonstrate AffGen's superior performance over existing baselines in generating affectively charged and interesting narratives. Our ablation study and analysis provide insights into the strengths and weaknesses of AffGen.

* Accepted at EMNLP-findings 2023
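
Reranking sentence candidates by affect intensity can be sketched as follows. This is an illustrative assumption, not AffGen's actual scoring: the tiny `AFFECT_LEXICON` and its values are made up for the example, and a real system would presumably use a trained affect model or an established lexicon.

```python
# Hypothetical affect lexicon: word -> intensity in [0, 1]. Values are illustrative only.
AFFECT_LEXICON = {
    "terrified": 0.9,
    "furious": 0.85,
    "delighted": 0.8,
    "worried": 0.5,
    "calm": 0.1,
}

def affect_intensity(sentence):
    """Score a sentence by its most intense affect word (0.0 if none is found)."""
    words = [w.strip(".,!?").lower() for w in sentence.split()]
    return max((AFFECT_LEXICON.get(w, 0.0) for w in words), default=0.0)

def rerank(candidates):
    """Order candidate sentences from most to least affectively intense."""
    return sorted(candidates, key=affect_intensity, reverse=True)

candidates = [
    "She walked to the door.",
    "She was worried about the noise.",
    "She was terrified of what waited behind the door.",
]

for sentence in rerank(candidates):
    print(round(affect_intensity(sentence), 2), sentence)
```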

GeoLM: Empowering Language Models for Geospatially Grounded Language Understanding

Oct 23, 2023

Zekun Li, Wenxuan Zhou, Yao-Yi Chiang, Muhao Chen

Abstract: Humans subconsciously engage in geospatial reasoning when reading articles. We recognize place names and their spatial relations in text and mentally associate them with their physical locations on Earth. Although pretrained language models can mimic this cognitive process using linguistic context, they do not utilize valuable geospatial information in large, widely available geographical databases, e.g., OpenStreetMap. This paper introduces GeoLM, a geospatially grounded language model that enhances the understanding of geo-entities in natural language. GeoLM leverages geo-entity mentions as anchors to connect linguistic information in text corpora with geospatial information extracted from geographical databases. GeoLM connects the two types of context through contrastive learning and masked language modeling. It also incorporates a spatial coordinate embedding mechanism to encode distance and direction relations to capture geospatial context. In the experiment, we demonstrate that GeoLM exhibits promising capabilities in supporting toponym recognition, toponym linking, relation extraction, and geo-entity typing, which bridge the gap between natural language processing and geospatial sciences. The code is publicly available at https://github.com/knowledge-computing/geolm.

* Accepted to EMNLP23 main

Primacy Effect of ChatGPT

Oct 20, 2023

Yiwei Wang, Yujun Cai, Muhao Chen, Yuxuan Liang, Bryan Hooi

Abstract: Instruction-tuned large language models (LLMs), such as ChatGPT, have led to promising zero-shot performance in discriminative natural language understanding (NLU) tasks. This involves querying the LLM using a prompt containing the question and the candidate labels to choose from. The question-answering capabilities of ChatGPT arise from its pre-training on large amounts of human-written text, as well as its subsequent fine-tuning on human preferences, which motivates us to ask: Does ChatGPT also inherit humans' cognitive biases? In this paper, we study the primacy effect of ChatGPT: the tendency of selecting the labels at earlier positions as the answer. We have two main findings: i) ChatGPT's decision is sensitive to the order of labels in the prompt; ii) ChatGPT has a clearly higher chance to select the labels at earlier positions as the answer. We hope that our experiments and analyses provide additional insights into building more reliable ChatGPT-based solutions. We release the source code at https://github.com/wangywUST/PrimacyEffectGPT.

* EMNLP 2023 short paper
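
One way to probe the two findings above is to ask the same question under every ordering of the candidate labels and record the position of the chosen label. The sketch below is an assumption about how such a probe could look, not the paper's experimental code: `ask_model` is a hypothetical callable standing in for an LLM query, and `first_label_model` is a deliberately biased stub used only to make the example runnable.

```python
from collections import Counter
from itertools import permutations

def position_bias(ask_model, question, labels):
    """Query the model under every label ordering.

    Returns a Counter mapping prompt position -> how often the chosen label
    sat at that position, plus a flag that is True when the chosen label
    changed across orderings (i.e. the decision is order-sensitive).
    """
    positions = Counter()
    answers = set()
    for order in permutations(labels):
        choice = ask_model(question, list(order))
        positions[order.index(choice)] += 1
        answers.add(choice)
    return positions, len(answers) > 1

def first_label_model(question, labels):
    """Hypothetical stand-in for an LLM call: always picks the first label shown."""
    return labels[0]

positions, order_sensitive = position_bias(
    first_label_model,
    "What is the sentiment of 'fine'?",
    ["positive", "negative", "neutral"],
)
print(positions, order_sensitive)
```

A model without positional bias would pick the same label in every ordering, so the chosen positions would spread evenly across the prompt and the order-sensitivity flag would be False.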

DOMINO: A Dual-System for Multi-step Visual Language Reasoning

Oct 04, 2023

Peifang Wang, Olga Golovneva, Armen Aghajanyan, Xiang Ren, Muhao Chen, Asli Celikyilmaz, Maryam Fazel-Zarandi

Abstract: Visual language reasoning requires a system to extract text or numbers from information-dense images like charts or plots and perform logical or arithmetic reasoning to arrive at an answer. To tackle this task, existing work relies on either (1) an end-to-end vision-language model trained on a large amount of data, or (2) a two-stage pipeline where a captioning model converts the image into text that is further read by another large language model to deduce the answer. However, the former approach forces the model to answer a complex question in a single step, and the latter approach is prone to inaccurate or distracting information in the converted text that can confuse the language model. In this work, we propose a dual-system for multi-step multimodal reasoning, which consists of a "System-1" step for visual information extraction and a "System-2" step for deliberate reasoning. Given an input, System-2 breaks down the question into atomic sub-steps, each guiding System-1 to extract the information required for reasoning from the image. Experiments on chart and plot datasets show that our method with a pre-trained System-2 module performs competitively compared to prior work on in- and out-of-distribution data. By fine-tuning the System-2 module (LLaMA-2 70B) on only a small amount of data on multi-step reasoning, the accuracy of our method is further improved and surpasses the best fully-supervised end-to-end approach by 5.7% and a pipeline approach with FlanPaLM (540B) by 7.5% on a challenging dataset with human-authored questions.

AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models

Oct 03, 2023

Xiaogeng Liu, Nan Xu, Muhao Chen, Chaowei Xiao

Abstract: Aligned Large Language Models (LLMs) are powerful language understanding and decision-making tools that are created through extensive alignment with human feedback. However, these large models remain susceptible to jailbreak attacks, where adversaries manipulate prompts to elicit malicious outputs that should not be given by aligned LLMs. Investigating jailbreak prompts can lead us to delve into the limitations of LLMs and further guide us to secure them. Unfortunately, existing jailbreak techniques suffer from either (1) scalability issues, where attacks heavily rely on manual crafting of prompts, or (2) stealthiness problems, as attacks depend on token-based algorithms to generate prompts that are often semantically meaningless, making them susceptible to detection through basic perplexity testing. In light of these challenges, we intend to answer this question: Can we develop an approach that can automatically generate stealthy jailbreak prompts? In this paper, we introduce AutoDAN, a novel jailbreak attack against aligned LLMs. AutoDAN can automatically generate stealthy jailbreak prompts using a carefully designed hierarchical genetic algorithm. Extensive evaluations demonstrate that AutoDAN not only automates the process while preserving semantic meaningfulness, but also demonstrates superior attack strength in cross-model transferability and cross-sample universality compared with the baseline. Moreover, we also compare AutoDAN with perplexity-based defense methods and show that AutoDAN can bypass them effectively.

* Pre-print, code is available at https://github.com/SheltonLiu-N/AutoDAN

Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer

Sep 19, 2023

Fei Wang, Kuan-Hao Huang, Kai-Wei Chang, Muhao Chen

Abstract: Zero-shot cross-lingual transfer is a central task in multilingual NLP, allowing models trained in languages with more sufficient training resources to generalize to other low-resource languages. Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data to improve cross-lingual transferability, which are typically expensive to obtain. In this paper, we propose a simple yet effective method, SALT, to improve the zero-shot cross-lingual transfer of the multilingual pretrained language models without the help of such external data. By incorporating code-switching and embedding mixup with self-augmentation, SALT effectively distills cross-lingual knowledge from the multilingual PLM and enhances its transferability on downstream tasks. Experimental results on XNLI and PAWS-X show that our method is able to improve zero-shot cross-lingual transferability without external data. Our code is available at https://github.com/luka-group/SALT.

* AACL 2023

Software Entity Recognition with Noise-Robust Learning

Aug 21, 2023

Tai Nguyen, Yifeng Di, Joohan Lee, Muhao Chen, Tianyi Zhang

Abstract: Recognizing software entities such as library names from free-form text is essential to enable many software engineering (SE) technologies, such as traceability link recovery, automated documentation, and API recommendation. While many approaches have been proposed to address this problem, they suffer from small entity vocabularies or noisy training data, hindering their ability to recognize software entities mentioned in sophisticated narratives. To address this challenge, we leverage the Wikipedia taxonomy to develop a comprehensive entity lexicon with 79K unique software entities in 12 fine-grained types, as well as a large labeled dataset of over 1.7M sentences. Then, we propose self-regularization, a noise-robust learning approach, to the training of our software entity recognition (SER) model by accounting for many dropouts. Results show that models trained with self-regularization outperform both their vanilla counterparts and state-of-the-art approaches on our Wikipedia benchmark and two Stack Overflow benchmarks. We release our models, data, and code for future research.

* ASE 2023

UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition

Aug 07, 2023

Wenxuan Zhou, Sheng Zhang, Yu Gu, Muhao Chen, Hoifung Poon

Abstract: Large language models (LLMs) have demonstrated remarkable generalizability, such as understanding arbitrary entities and relations. Instruction tuning has proven effective for distilling LLMs into more cost-efficient models such as Alpaca and Vicuna. Yet such student models still trail the original LLMs by large margins in downstream applications. In this paper, we explore targeted distillation with mission-focused instruction tuning to train student models that can excel in a broad application class such as open information extraction. Using named entity recognition (NER) as a case study, we show how ChatGPT can be distilled into much smaller UniversalNER models for open NER. For evaluation, we assemble the largest NER benchmark to date, comprising 43 datasets across 9 diverse domains such as biomedicine, programming, social media, law, and finance. Without using any direct supervision, UniversalNER attains remarkable NER accuracy across tens of thousands of entity types, outperforming general instruction-tuned models such as Alpaca and Vicuna by over 30 absolute F1 points on average. With a tiny fraction of parameters, UniversalNER not only acquires ChatGPT's capability in recognizing arbitrary entity types, but also outperforms its NER accuracy by 7-9 absolute F1 points on average. Remarkably, UniversalNER even outperforms by a large margin state-of-the-art multi-task instruction-tuned systems such as InstructUIE, which uses supervised NER examples. We also conduct thorough ablation studies to assess the impact of various components in our distillation approach. We will release the distillation recipe, data, and UniversalNER models to facilitate future research on targeted distillation.

* Project page: https://universal-ner.github.io/

Contrastive Bootstrapping for Label Refinement

Jun 07, 2023

Shudi Hou, Yu Xia, Muhao Chen, Sujian Li

Abstract: Traditional text classification typically categorizes texts into pre-defined coarse-grained classes, from which the produced models cannot handle the real-world scenario where finer categories emerge periodically for accurate services. In this work, we investigate the setting where fine-grained classification is done only using the annotation of coarse-grained categories and the coarse-to-fine mapping. We propose a lightweight contrastive clustering-based bootstrapping method to iteratively refine the labels of passages. During clustering, it pulls away negative passage-prototype pairs under the guidance of the mapping from both global and local perspectives. Experiments on NYT and 20News show that our method outperforms the state-of-the-art methods by a large margin.

* ACL 2023
