Zhen Wan

Pushing the Limits of ChatGPT on NLP Tasks

Jun 16, 2023
Xiaofei Sun, Linfeng Dong, Xiaoya Li, Zhen Wan, Shuhe Wang, Tianwei Zhang, Jiwei Li, Fei Cheng, Lingjuan Lyu, Fei Wu, Guoyin Wang

Despite the success of ChatGPT, its performance on most NLP tasks is still well below the supervised baselines. In this work, we look into the causes and find that its subpar performance stems from the following factors: (1) the token limit in the prompt does not allow for full utilization of the supervised datasets; (2) a mismatch between the generation nature of ChatGPT and NLP tasks; (3) intrinsic pitfalls of LLMs, e.g., hallucination, excessive focus on certain keywords, etc. We propose a collection of general modules to address these issues, in an attempt to push the limits of ChatGPT on NLP tasks. Our proposed modules include (1) a one-input-multiple-prompts strategy that employs multiple prompts for one input to accommodate more demonstrations; (2) using fine-tuned models for better demonstration retrieval; (3) transforming tasks to formats that are more tailored to the generation nature; (4) employing reasoning strategies that are tailored to addressing the task-specific complexity; (5) a self-verification strategy to address the hallucination issue of LLMs; (6) a paraphrase strategy to improve the robustness of model predictions. We conduct experiments on 21 datasets of 10 representative NLP tasks, including question answering, commonsense reasoning, natural language inference, sentiment analysis, named entity recognition, entity-relation extraction, event extraction, dependency parsing, semantic role labeling, and part-of-speech tagging. Using the proposed assembly of techniques, we are able to significantly boost the performance of ChatGPT on the selected NLP tasks, achieving performance comparable to or better than supervised baselines, or even existing SOTA performance.
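
The one-input-multiple-prompts strategy can be illustrated with a minimal sketch. The abstract does not say how the outputs of the separate prompts are combined, so majority voting is an assumption here, as are the prompt template and every name in the code (chunk, build_prompt, predict_with_multiple_prompts, toy_llm). The llm argument stands in for any function that maps a prompt string to a label string.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Demo = Tuple[str, str]  # (input text, gold label)


def chunk(demos: Sequence[Demo], size: int) -> List[List[Demo]]:
    """Split the demonstration pool into groups small enough for one prompt."""
    return [list(demos[i:i + size]) for i in range(0, len(demos), size)]


def build_prompt(demos: Sequence[Demo], query: str) -> str:
    """Format the demonstrations, then the query with an empty label slot."""
    blocks = [f"Input: {x}\nLabel: {y}" for x, y in demos]
    blocks.append(f"Input: {query}\nLabel:")
    return "\n\n".join(blocks)


def predict_with_multiple_prompts(
    query: str,
    demos: Sequence[Demo],
    llm: Callable[[str], str],
    demos_per_prompt: int = 2,
) -> str:
    """Issue one prompt per demonstration group and majority-vote the answers.

    Majority voting is an assumed aggregation rule, not taken from the paper.
    """
    # Fall back to a single zero-shot prompt when there are no demonstrations.
    groups = chunk(demos, demos_per_prompt) or [[]]
    answers = [llm(build_prompt(group, query)).strip() for group in groups]
    return Counter(answers).most_common(1)[0][0]


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call, used only for illustration."""
    return "positive" if "Label: positive" in prompt else "negative"


if __name__ == "__main__":
    pool = [
        ("great film", "positive"),
        ("dull plot", "negative"),
        ("loved it", "positive"),
        ("superb cast", "positive"),
    ]
    print(predict_with_multiple_prompts("a fine movie", pool, toy_llm))
```

With a real model, the per-prompt group size would be chosen so that each prompt stays under the token limit, while the full pool is still covered across prompts.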

GPT-RE: In-context Learning for Relation Extraction using Large Language Models

May 03, 2023
Zhen Wan, Fei Cheng, Zhuoyuan Mao, Qianying Liu, Haiyue Song, Jiwei Li, Sadao Kurohashi

In spite of the potential for ground-breaking achievements offered by large language models (LLMs) (e.g., GPT-3), they still lag significantly behind fully-supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE). This is due to two major shortcomings of LLMs in RE: (1) low relevance regarding entity and relation in retrieved demonstrations for in-context learning; and (2) the strong inclination to wrongly classify NULL examples into other pre-defined labels. In this paper, we propose GPT-RE to bridge the gap between LLMs and fully-supervised baselines. GPT-RE successfully addresses the aforementioned issues by (1) incorporating task-specific entity representations in demonstration retrieval; and (2) enriching the demonstrations with gold label-induced reasoning logic. We evaluate GPT-RE on four widely-used RE datasets, and observe that GPT-RE achieves improvements over not only existing GPT-3 baselines, but also fully-supervised baselines. Specifically, GPT-RE achieves SOTA performance on the SemEval and SciERC datasets, and competitive performance on the TACRED and ACE05 datasets.
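
Demonstration retrieval of this kind can be sketched as a nearest-neighbor search over example representations. The sketch below is an illustration under assumptions, not GPT-RE's implementation: the vectors are taken as given (how the task-specific entity representations are produced is not shown), cosine similarity is an assumed similarity measure, and the names cosine and retrieve_demonstrations are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (representation vector, example text, relation label)
PoolItem = Tuple[Sequence[float], str, str]


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_demonstrations(
    query_vec: Sequence[float], pool: Sequence[PoolItem], k: int = 2
) -> List[Tuple[str, str]]:
    """Return the k pool examples whose representations are closest to the query."""
    ranked = sorted(pool, key=lambda item: cosine(query_vec, item[0]), reverse=True)
    return [(text, label) for _, text, label in ranked[:k]]


if __name__ == "__main__":
    # Toy two-dimensional vectors chosen by hand for illustration only.
    pool = [
        ([1.0, 0.0], "example a", "REL_X"),
        ([0.0, 1.0], "example b", "REL_Y"),
        ([0.9, 0.1], "example c", "REL_X"),
    ]
    print(retrieve_demonstrations([1.0, 0.0], pool, k=2))
```

The retrieved (text, label) pairs would then be formatted as in-context demonstrations ahead of the test input.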

Seeking Diverse Reasoning Logic: Controlled Equation Expression Generation for Solving Math Word Problems

Sep 21, 2022
Yibin Shen, Qianying Liu, Zhuoyuan Mao, Zhen Wan, Fei Cheng, Sadao Kurohashi

To solve Math Word Problems, human students leverage diverse reasoning logic that reaches different possible equation solutions. However, the mainstream sequence-to-sequence approach of automatic solvers aims to decode a fixed solution equation supervised by human annotation. In this paper, we propose a controlled equation generation solver that leverages a set of control codes to guide the model to consider certain reasoning logic and decode the corresponding equation expressions transformed from the human reference. The empirical results suggest that our method universally improves the performance on single-unknown (Math23K) and multiple-unknown (DRAW1K, HMWP) benchmarks, with substantial improvements of up to 13.2% in accuracy on the challenging multiple-unknown datasets.

* AACL 2022 short paper 

Relation Extraction with Weighted Contrastive Pre-training on Distant Supervision

May 18, 2022
Zhen Wan, Fei Cheng, Qianying Liu, Zhuoyuan Mao, Haiyue Song, Sadao Kurohashi

Contrastive pre-training on distant supervision has shown remarkable effectiveness for improving supervised relation extraction tasks. However, the existing methods ignore the intrinsic noise of distant supervision during the pre-training stage. In this paper, we propose a weighted contrastive learning method by leveraging the supervised data to estimate the reliability of pre-training instances and explicitly reduce the effect of noise. Experimental results on three supervised datasets demonstrate the advantages of our proposed weighted contrastive learning approach, compared to two state-of-the-art non-weighted baselines.
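
One way to picture a weighted contrastive objective is to scale each instance's InfoNCE loss by a reliability weight. The sketch below is a generic illustration under assumptions and is not the paper's formulation: the weights are supplied as inputs (the abstract does not specify how they are estimated from the supervised data), normalizing by the sum of weights is an assumed choice, and the names dot, info_nce, and weighted_contrastive_loss are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vector = Sequence[float]
# (anchor, positive, list of negatives)
Instance = Tuple[Vector, Vector, Sequence[Vector]]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def info_nce(
    anchor: Vector, positive: Vector, negatives: Sequence[Vector], temperature: float = 0.1
) -> float:
    """Standard InfoNCE loss for one anchor with one positive and several negatives."""
    pos = math.exp(dot(anchor, positive) / temperature)
    neg = sum(math.exp(dot(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def weighted_contrastive_loss(
    instances: Sequence[Instance], weights: Sequence[float], temperature: float = 0.1
) -> float:
    """Weighted average of per-instance InfoNCE losses.

    A weight near 0 suppresses an instance believed to be noisy; a weight near 1
    keeps its full contribution. The normalization is an assumption of this sketch.
    """
    total = sum(weights)
    if total == 0:
        return 0.0
    weighted = sum(
        w * info_nce(a, p, negs, temperature)
        for w, (a, p, negs) in zip(weights, instances)
    )
    return weighted / total


if __name__ == "__main__":
    clean = ([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])  # positive matches the anchor
    noisy = ([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])  # mislabeled positive
    print(weighted_contrastive_loss([clean, noisy], [1.0, 1.0], temperature=1.0))
    print(weighted_contrastive_loss([clean, noisy], [1.0, 0.0], temperature=1.0))
```

In the toy example, setting the weight of the mislabeled instance to zero removes its contribution entirely, which is the intended effect of down-weighting noisy distantly supervised pairs.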

* Under review 

When do Contrastive Word Alignments Improve Many-to-many Neural Machine Translation?

Apr 26, 2022
Zhuoyuan Mao, Chenhui Chu, Raj Dabre, Haiyue Song, Zhen Wan, Sadao Kurohashi

Word alignment has proven to benefit many-to-many neural machine translation (NMT). However, high-quality ground-truth bilingual dictionaries were used for pre-editing in previous methods, which are unavailable for most language pairs. Meanwhile, the contrastive objective can implicitly utilize automatically learned word alignment, which has not been explored in many-to-many NMT. This work proposes a word-level contrastive objective to leverage word alignments for many-to-many NMT. Empirical results show that this leads to 0.8 BLEU gains for several language pairs. Analyses reveal that in many-to-many NMT, the encoder's sentence retrieval performance highly correlates with the translation quality, which explains when the proposed method impacts translation. This motivates future exploration for many-to-many NMT to improve the encoder's sentence retrieval performance.

* NAACL 2022 findings 