
Fei Cheng

Pushing the Limits of ChatGPT on NLP Tasks

Jun 16, 2023
Xiaofei Sun, Linfeng Dong, Xiaoya Li, Zhen Wan, Shuhe Wang, Tianwei Zhang, Jiwei Li, Fei Cheng, Lingjuan Lyu, Fei Wu, Guoyin Wang

Despite the success of ChatGPT, its performance on most NLP tasks still falls well below supervised baselines. In this work, we investigate the causes and find that the subpar performance stems from the following factors: (1) the token limit of the prompt prevents full utilization of supervised datasets; (2) a mismatch between the generative nature of ChatGPT and many NLP tasks; (3) intrinsic pitfalls of LLMs, e.g., hallucination and an excessive focus on certain keywords. We propose a collection of general modules to address these issues, in an attempt to push the limits of ChatGPT on NLP tasks: (1) a one-input-multiple-prompts strategy that employs multiple prompts for one input to accommodate more demonstrations; (2) using fine-tuned models for better demonstration retrieval; (3) transforming tasks into formats better suited to generation; (4) employing reasoning strategies tailored to task-specific complexity; (5) a self-verification strategy to address the hallucination issue of LLMs; (6) a paraphrase strategy to improve the robustness of model predictions. We conduct experiments on 21 datasets covering 10 representative NLP tasks: question answering, commonsense reasoning, natural language inference, sentiment analysis, named entity recognition, entity-relation extraction, event extraction, dependency parsing, semantic role labeling, and part-of-speech tagging. Using the proposed ensemble of techniques, we significantly boost the performance of ChatGPT on the selected NLP tasks, achieving results comparable to or better than supervised baselines, and in some cases surpassing existing SOTA performance.
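The one-input-multiple-prompts strategy can be pictured with a small sketch: split a demonstration pool that would overflow one prompt into several prompts, query the model once per prompt, and aggregate the answers by majority vote. Everything here is an illustrative assumption; `query_model` is a stand-in for a real ChatGPT API call, not the paper's interface.

```python
from collections import Counter

def chunk(demos, size):
    """Split the demonstration pool into prompt-sized groups."""
    return [demos[i:i + size] for i in range(0, len(demos), size)]

def one_input_multiple_prompts(x, demos, per_prompt, query_model):
    """Query the model once per demonstration group, then majority-vote."""
    answers = []
    for group in chunk(demos, per_prompt):
        prompt = "\n".join(f"Input: {d}\nLabel: {l}" for d, l in group)
        prompt += f"\nInput: {x}\nLabel:"
        answers.append(query_model(prompt))
    # Aggregate the per-prompt predictions by majority vote.
    return Counter(answers).most_common(1)[0][0]
```

This way each prompt stays under the token limit while the ensemble still sees the whole demonstration pool.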

MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting

May 26, 2023
Tatsuro Inaba, Hirokazu Kiyomaru, Fei Cheng, Sadao Kurohashi

Large language models (LLMs) have achieved impressive performance on various reasoning tasks. To further improve the performance, we propose MultiTool-CoT, a novel framework that leverages chain-of-thought (CoT) prompting to incorporate multiple external tools, such as a calculator and a knowledge retriever, during the reasoning process. We apply MultiTool-CoT to the Task 2 dataset of NumGLUE, which requires both numerical reasoning and domain-specific knowledge. The experiments show that our method significantly outperforms strong baselines and achieves state-of-the-art performance.

* ACL2023. Our code is available at https://github.com/InabaTatsuro/MultiTool-CoT 
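The control flow of tool-augmented chain-of-thought decoding can be sketched as follows: the framework watches each generated reasoning step for a tool trigger, runs the tool, and feeds the result back into the context before generation resumes. The `<<ToolName: args>>` trigger format and the `generate` callback are illustrative assumptions, not the paper's exact interface.

```python
import re

# Toy tool registry; eval is used only as a stand-in calculator on trusted
# input -- never eval untrusted model output in a real system.
TOOLS = {"Calculator": lambda expr: str(eval(expr, {"__builtins__": {}}))}

def run_with_tools(prompt, generate, max_steps=10):
    text = prompt
    for _ in range(max_steps):
        step = generate(text)          # model produces the next reasoning step
        text += step
        m = re.search(r"<<(\w+): (.*?)>>", step)
        if m and m.group(1) in TOOLS:
            # Intercept the tool call, run it, and append the result so the
            # model can condition on it in the next step.
            text += " -> " + TOOLS[m.group(1)](m.group(2)) + "\n"
        if "ANSWER:" in step:
            break
    return text
```

The key design point is that the tool output is injected mid-reasoning, so later chain-of-thought steps can build on exact computed values instead of the model's own arithmetic.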
Comprehensive Solution Program Centric Pretraining for Table-and-Text Hybrid Numerical Reasoning

May 12, 2023
Qianying Liu, Dongsheng Yang, Wenjie Zhong, Fei Cheng, Sadao Kurohashi

Numerical reasoning over table-and-text hybrid passages, such as financial reports, poses significant challenges and has numerous potential applications. Noise and irrelevant variables in the model input have been a hindrance to its performance. Additionally, coarse-grained supervision of the whole solution program has impeded the model's ability to learn the underlying numerical reasoning process. In this paper, we propose three pretraining tasks that operate at both the whole program and sub-program level: Variable Integrity Ranking, which guides the model to focus on useful variables; Variable Operator Prediction, which decomposes the supervision into fine-grained single operator prediction; and Variable Keyphrase Masking, which encourages the model to identify key evidence that sub-programs are derived from. Experimental results demonstrate the effectiveness of our proposed methods, surpassing transformer-based model baselines.

* 11 pages 
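The "fine-grained single operator prediction" idea can be illustrated by flattening a nested solution program into one step per operator, so supervision can attach to each sub-program rather than only the whole program. The prefix syntax `add(sub(x, y), z)` and the `#i` intermediate names are assumptions for illustration, not the paper's exact program format.

```python
import re

def decompose(program):
    """Flatten a nested program such as "add(sub(x, y), z)" into
    single-operator steps, naming intermediate results #0, #1, ..."""
    tokens = re.findall(r"[A-Za-z_#][\w#]*|\(|\)|,", program)
    steps = []
    pos = 0

    def parse():
        nonlocal pos
        name = tokens[pos]
        pos += 1
        if pos < len(tokens) and tokens[pos] == "(":
            pos += 1                      # consume "("
            args = [parse()]
            while tokens[pos] == ",":
                pos += 1
                args.append(parse())
            pos += 1                      # consume ")"
            steps.append((name, args))
            return f"#{len(steps) - 1}"   # name for this intermediate result
        return name                       # a plain variable

    parse()
    return steps
```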
GPT-RE: In-context Learning for Relation Extraction using Large Language Models

May 03, 2023
Zhen Wan, Fei Cheng, Zhuoyuan Mao, Qianying Liu, Haiyue Song, Jiwei Li, Sadao Kurohashi

In spite of the potential for ground-breaking achievements offered by large language models (LLMs) such as GPT-3, they still lag significantly behind fully-supervised baselines (e.g., fine-tuned BERT) in relation extraction (RE). This is due to two major shortcomings of LLMs in RE: (1) low relevance regarding entity and relation in the retrieved demonstrations for in-context learning; and (2) a strong inclination to wrongly classify NULL examples into other pre-defined labels. In this paper, we propose GPT-RE to bridge the gap between LLMs and fully-supervised baselines. GPT-RE addresses the aforementioned issues by (1) incorporating task-specific entity representations in demonstration retrieval; and (2) enriching the demonstrations with gold-label-induced reasoning logic. We evaluate GPT-RE on four widely used RE datasets and observe that GPT-RE achieves improvements over not only existing GPT-3 baselines but also fully-supervised baselines. Specifically, GPT-RE achieves SOTA performance on the SemEval and SciERC datasets, and competitive performance on the TACRED and ACE05 datasets.
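Demonstration retrieval of this kind reduces to nearest-neighbour search over task-aware representations: rank the training pool by similarity to the query's entity-aware embedding rather than a raw sentence embedding. In this sketch, `embed` is a stand-in for a fine-tuned RE encoder; the plain-Python cosine similarity is only for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two plain-list vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve_demos(query, pool, embed, k=2):
    """Return the k pool examples whose (entity-aware) embedding is closest
    to the query's embedding."""
    q = embed(query)
    ranked = sorted(pool, key=lambda ex: cosine(q, embed(ex)), reverse=True)
    return ranked[:k]
```

The improvement in the paper comes from what `embed` encodes (entity spans and their types), not from the retrieval mechanics themselves.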

Textual Enhanced Contrastive Learning for Solving Math Word Problems

Nov 29, 2022
Yibin Shen, Qianying Liu, Zhuoyuan Mao, Fei Cheng, Sadao Kurohashi

Solving math word problems requires analyzing the relations among quantities and an accurate understanding of contextual natural language information. Recent studies show that current models rely on shallow heuristics to predict solutions and can be easily misled by small textual perturbations. To address this problem, we propose a Textual Enhanced Contrastive Learning framework, which forces models to distinguish semantically similar examples that hold different mathematical logic. We adopt a self-supervised strategy to enrich examples with subtle textual variance via textual reordering or problem re-construction. We then retrieve the hardest-to-differentiate samples from both the equation and textual perspectives and guide the model to learn their representations. Experimental results show that our method achieves state-of-the-art performance on both widely used benchmark datasets and carefully designed challenge datasets in English and Chinese. Our code and data are available at https://github.com/yiyunya/Textual_CL_MWP

* Findings of EMNLP 2022 
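Hard-negative mining in this spirit can be sketched in a few lines: among examples whose equation (i.e. mathematical logic) differs from the anchor's, pick the one whose representation is most similar, since that is the hardest to differentiate. Plain vectors stand in for encoder outputs here; the selection rule is an illustrative simplification.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def hardest_negative(anchor_vec, anchor_eq, pool):
    """pool: list of (vector, equation). Return the index of the most
    similar example with a *different* equation, or None if no example
    qualifies."""
    best, best_sim = None, -2.0
    for i, (vec, eq) in enumerate(pool):
        if eq == anchor_eq:
            continue  # same mathematical logic: a positive, not a negative
        sim = cosine(anchor_vec, vec)
        if sim > best_sim:
            best, best_sim = i, sim
    return best
```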
ComSearch: Equation Searching with Combinatorial Strategy for Solving Math Word Problems with Weak Supervision

Oct 13, 2022
Qianying Liu, Wenyu Guan, Jianhao Shen, Fei Cheng, Sadao Kurohashi

Previous studies have introduced a weakly supervised paradigm for solving math word problems that requires only answer-value annotation. While these methods search for correct-value equation candidates as pseudo labels, they explore only a narrow sub-space of the enormous equation space. To address this problem, we propose ComSearch, a novel search algorithm with a combinatorial strategy that compresses the search space by excluding mathematically equivalent equations. The compression allows the search algorithm to enumerate all possible equations and obtain high-quality data. We investigate the noise in pseudo labels that encode incorrect mathematical logic, which we refer to as the false-matching problem, and propose a ranking model to denoise the pseudo labels. Our approach provides a flexible framework in which two existing supervised math word problem solvers are trained on the pseudo labels, and both achieve state-of-the-art performance in the weak-supervision task.

* 13 pages 
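The idea of collapsing mathematically equivalent equations can be sketched with an evaluation-based fingerprint: evaluate each candidate expression on a few random variable assignments and keep one representative per fingerprint. This numeric test is a simplification of ComSearch's combinatorial analysis, used here purely for illustration; eval is applied only to trusted toy expressions.

```python
import random

def dedupe(expressions, variables, trials=5, seed=0):
    """Keep one representative per equivalence class of expressions, where
    two expressions are treated as equivalent if they agree on several
    random assignments of the variables."""
    rng = random.Random(seed)
    probes = [{v: rng.uniform(1, 10) for v in variables} for _ in range(trials)]
    seen, kept = set(), []
    for expr in expressions:
        # Fingerprint: the expression's (rounded) values on the probe points.
        fp = tuple(round(eval(expr, {}, env), 6) for env in probes)
        if fp not in seen:
            seen.add(fp)
            kept.append(expr)
    return kept
```

With equivalent forms removed, an exhaustive enumeration of the remaining space becomes feasible, which is the point of the compression.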
Seeking Diverse Reasoning Logic: Controlled Equation Expression Generation for Solving Math Word Problems

Sep 21, 2022
Yibin Shen, Qianying Liu, Zhuoyuan Mao, Zhen Wan, Fei Cheng, Sadao Kurohashi

To solve math word problems, human students leverage diverse reasoning logic that reaches different possible equation solutions. However, the mainstream sequence-to-sequence approach of automatic solvers aims to decode a fixed solution equation supervised by human annotation. In this paper, we propose a controlled equation generation solver that leverages a set of control codes to guide the model to consider certain reasoning logic and decode the corresponding equation expressions transformed from the human reference. The empirical results suggest that our method universally improves performance on single-unknown (Math23K) and multiple-unknown (DRAW1K, HMWP) benchmarks, with substantial improvements of up to 13.2% accuracy on the challenging multiple-unknown datasets.

* AACL 2022 short paper 
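Mechanically, control-code conditioning amounts to prefixing the input with a code token and pairing it with the matching transformed target equation. Both the `<code>` token format and the toy commutative transform below are hypothetical, chosen only to make the pairing concrete.

```python
def with_control_code(problem, code):
    """Prefix the problem text with a control-code token (format assumed)."""
    return f"<{code}> {problem}"

def transform_equation(eq, code):
    """Toy transform producing the equation variant a control code targets.
    Only a commutative 'swap' of an 'lhs = a + b' equation is handled."""
    if code == "swap":
        lhs, rhs = eq.split("=")
        x, y = [t.strip() for t in rhs.split("+")]
        return f"{lhs.strip()} = {y} + {x}"
    return eq
```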
Relation Extraction with Weighted Contrastive Pre-training on Distant Supervision

May 18, 2022
Zhen Wan, Fei Cheng, Qianying Liu, Zhuoyuan Mao, Haiyue Song, Sadao Kurohashi

Contrastive pre-training on distant supervision has shown remarkable effectiveness for improving supervised relation extraction tasks. However, the existing methods ignore the intrinsic noise of distant supervision during the pre-training stage. In this paper, we propose a weighted contrastive learning method by leveraging the supervised data to estimate the reliability of pre-training instances and explicitly reduce the effect of noise. Experimental results on three supervised datasets demonstrate the advantages of our proposed weighted contrastive learning approach, compared to two state-of-the-art non-weighted baselines.

* Under review 
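The shape of a weighted contrastive objective can be sketched directly: each distantly supervised instance contributes its InfoNCE-style term in proportion to a reliability weight. The weighting scheme and the plain-scalar similarities below are illustrative assumptions, not the paper's exact formulation.

```python
import math

def weighted_contrastive_loss(pos_sims, neg_sims, weights, tau=0.1):
    """pos_sims[i]: similarity of anchor i to its positive;
    neg_sims[i]: list of similarities of anchor i to its negatives;
    weights[i]: estimated reliability of instance i in [0, 1]."""
    total, wsum = 0.0, 0.0
    for p, negs, w in zip(pos_sims, neg_sims, weights):
        denom = math.exp(p / tau) + sum(math.exp(n / tau) for n in negs)
        # Downweight unreliable (noisy) distant-supervision instances.
        total += w * -math.log(math.exp(p / tau) / denom)
        wsum += w
    return total / wsum
```

Setting a weight to zero removes that instance's influence entirely, which is the mechanism by which estimated noise is suppressed.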
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition

Apr 08, 2022
Qianying Liu, Yuhang Yang, Zhuo Gong, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Sadao Kurohashi

Low-resource speech recognition has long suffered from insufficient training data. While neighbour languages are often used as auxiliary training data, it is difficult for a model to induce similar units (characters, subwords, etc.) across languages. In this paper, we assume that similar units in neighbour languages share similar term frequencies, and we form a Huffman tree to perform multilingual hierarchical Softmax decoding. During decoding, the hierarchical structure benefits the training of low-resource languages. Experimental results show the effectiveness of our method.

* 5 pages, Interspeech submission 
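The core construction is a standard Huffman tree over unit frequencies (here imagined as pooled across neighbour languages), so frequent units get short root-to-leaf paths and each decoding step is a small decision. A minimal sketch that returns only the resulting code lengths:

```python
import heapq
import itertools

def huffman_code_lengths(freqs):
    """freqs: dict unit -> frequency. Return dict unit -> Huffman code
    length (tree depth of the unit's leaf)."""
    counter = itertools.count()  # tie-breaker so the heap never compares dicts
    heap = [(f, next(counter), {u: 0}) for u, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees; every unit inside them
        # moves one level deeper.
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {u: l + 1 for u, l in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(counter), merged))
    return heap[0][2]
```

The resulting lengths satisfy the Kraft equality (the code lengths of a full binary tree), which is what makes the factorized hierarchical Softmax a proper distribution over units.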
Cross-lingual Adaption Model-Agnostic Meta-Learning for Natural Language Understanding

Nov 10, 2021
Qianying Liu, Fei Cheng, Sadao Kurohashi

Meta-learning with auxiliary languages has demonstrated promising improvements for cross-lingual natural language processing. However, previous studies sample the meta-training and meta-testing data from the same language, which limits the model's ability to transfer across languages. In this paper, we propose XLA-MAML, which performs direct cross-lingual adaptation in the meta-learning stage. We conduct zero-shot and few-shot experiments on natural language inference and question answering. The experimental results demonstrate the effectiveness of our method across different languages, tasks, and pretrained models. We also analyze various cross-lingual-specific settings for meta-learning, including the sampling strategy and parallelism.

* 11 pages 
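The cross-lingual twist on MAML can be shown with a toy first-order step: the inner gradient update adapts on one language's support set, while the meta-gradient is taken on a *different* language's query set. The 1-parameter least-squares "model" below is purely illustrative; real XLA-MAML operates on pretrained transformer parameters.

```python
def loss(theta, data):
    """Mean squared error of the 1-parameter model y = theta * x."""
    return sum((theta * x - y) ** 2 for x, y in data) / len(data)

def grad(theta, data):
    """Analytic gradient of the loss above."""
    return sum(2 * x * (theta * x - y) for x, y in data) / len(data)

def xla_maml_step(theta, support_lang_a, query_lang_b, inner_lr, meta_lr):
    # Inner step: adapt on language A's support set.
    adapted = theta - inner_lr * grad(theta, support_lang_a)
    # Outer step: first-order meta-update evaluated on language B's query
    # set, so the meta-objective directly rewards cross-lingual transfer.
    return theta - meta_lr * grad(adapted, query_lang_b)
```

Sampling support and query sets from different languages is the whole modification; the surrounding MAML machinery is unchanged.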