Houfeng Wang

Preference Ranking Optimization for Human Alignment

Jun 30, 2023
Feifan Song, Bowen Yu, Minghao Li, Haiyang Yu, Fei Huang, Yongbin Li, Houfeng Wang

Large language models (LLMs) often contain misleading content, emphasizing the need to align them with human values to ensure secure AI systems. Reinforcement learning from human feedback (RLHF) has been employed to achieve this alignment by combining a reward model, typically based on Bradley-Terry paired comparison, with an RL algorithm such as Proximal Policy Optimization (PPO) to optimize LLM responses. However, RLHF exhibits complexity, instability, and sensitivity to hyperparameters. In this paper, we propose Preference Ranking Optimization (PRO) as an alternative to PPO for directly aligning LLMs with the Bradley-Terry comparison. PRO extends the pairwise Bradley-Terry comparison to accommodate preference rankings of any length. By iteratively contrasting the likelihood of generating responses, PRO instructs the LLM to prioritize the best response while progressively ranking the remaining responses. In this manner, PRO effectively transforms human alignment into aligning the probability ranking of $n$ responses generated by the LLM with the preference ranking of humans towards these responses. Experiments have shown that PRO outperforms existing alignment algorithms, achieving comparable results to ChatGPT and human responses through automatic-based, reward-based, GPT-4, and human evaluations. Furthermore, we demonstrate that longer, more diverse, and higher-quality preference ranking sequences can consistently enhance the performance of human alignment.
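
One way to read "extending the pairwise Bradley-Terry comparison to rankings of any length" is as a list-wise loss that, at each step, contrasts the current best response against all responses ranked below it. The sketch below is a minimal illustration under that assumption, not the authors' implementation: the function name `pro_ranking_loss` is hypothetical, and `scores` stands for whatever per-response score the model assigns (for example a length-normalised log-likelihood), ordered from most to least preferred.

```python
import math


def pro_ranking_loss(scores):
    """List-wise ranking loss over responses ordered best to worst.

    Assumption: scores[k] is the model's score for the k-th most preferred
    response. For each k, the k-th response is contrasted against itself
    and every lower-ranked response, then dropped from the candidate set.
    """
    loss = 0.0
    for k in range(len(scores) - 1):
        tail = scores[k:]
        # Numerically stable log-sum-exp over the remaining candidates.
        m = max(tail)
        log_denominator = m + math.log(sum(math.exp(s - m) for s in tail))
        # Negative log-probability that response k wins among the remainder.
        loss += log_denominator - scores[k]
    return loss


# With two responses the loss reduces to the pairwise Bradley-Terry form.
print(pro_ranking_loss([1.0, 0.0]))
# A score ordering that agrees with the preference ranking gives a lower loss.
print(pro_ranking_loss([2.0, 1.0, 0.0]) < pro_ranking_loss([0.0, 1.0, 2.0]))
```

With only two responses this is the negative log-sigmoid of the score difference, which is the pairwise case the abstract starts from.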

Semiparametric Language Models Are Scalable Continual Learners

Mar 02, 2023
Guangyue Peng, Tao Ge, Si-Qing Chen, Furu Wei, Houfeng Wang

Semiparametric language models (LMs) have shown promise in continuously learning from new text data by combining a parameterized neural LM with a growable non-parametric memory for memorizing new content. However, conventional semiparametric LMs eventually become prohibitively expensive to compute and store if they are applied to continual learning over streaming data, because the non-parametric memory grows linearly with the amount of data they learn from over time. To address the issue of scalability, we present a simple and intuitive approach called Selective Memorization (SeMem), which only memorizes difficult samples that the model is likely to struggle with. We demonstrate that SeMem improves the scalability of semiparametric LMs for continual learning over streaming data in two ways: (1) data-wise scalability: as the model becomes stronger through continual learning, it will encounter fewer difficult cases that need to be memorized, causing the growth of the non-parametric memory to slow down over time rather than growing at a linear rate with the size of training data; (2) model-wise scalability: SeMem allows a larger model to memorize fewer samples than its smaller counterpart because it is rarer for a larger model to encounter incomprehensible cases, resulting in a non-parametric memory that does not scale linearly with model size. We conduct extensive experiments in language modeling and downstream tasks to evaluate SeMem, showing that it enables a semiparametric LM to be a scalable continual learner with little forgetting.
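
The core idea of memorizing only difficult samples can be sketched as a filter in front of the memory. The abstract does not say how difficulty is measured, so the criterion here (negative log-likelihood above a fixed `threshold`) is an assumption, as are the names `selective_memorize` and `predict_prob`.

```python
import math


def selective_memorize(samples, predict_prob, memory, threshold=1.0):
    """Append only hard samples to the non-parametric memory.

    Assumptions (not from the abstract): `predict_prob(context, target)`
    returns the parametric LM's probability of `target`, and a sample is
    "difficult" when its negative log-likelihood exceeds `threshold`.
    """
    for context, target in samples:
        loss = -math.log(max(predict_prob(context, target), 1e-12))
        if loss > threshold:
            memory.append((context, target))
    return memory


# Toy stand-in for a language model: confident on one context, not the other.
def toy_prob(context, target):
    return 0.9 if context == "the capital of France is" else 0.1


stream = [("the capital of France is", "Paris"), ("zxqv means", "tree")]
print(selective_memorize(stream, toy_prob, memory=[]))
```

Under this sketch, a stronger model assigns higher probability to more samples, so fewer of them cross the threshold and the memory grows more slowly, which is the data-wise and model-wise scalability the abstract describes.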

* Work in progress 

A Unified Framework for Multi-intent Spoken Language Understanding with prompting

Oct 07, 2022
Feifan Song, Lianzhe Huang, Houfeng Wang

Multi-intent Spoken Language Understanding has great potential for widespread implementation. Jointly modeling Intent Detection (ID) and Slot Filling (SF) in it provides a channel to exploit the correlation between intents and slots. However, current approaches tend to formulate these two sub-tasks differently, which leads to two issues: 1) It hinders models from effectively extracting shared features. 2) Rather complicated structures are introduced to enhance expressive power, at the cost of framework interpretability. In this work, we describe a Prompt-based Spoken Language Understanding (PromptSLU) framework, which intuitively unifies the two sub-tasks into the same form by offering a common pre-trained Seq2Seq model. In detail, ID and SF are completed by concisely filling the utterance into task-specific prompt templates as input, and sharing an output format of key-value pair sequences. Furthermore, variable intents are predicted first, then naturally embedded into prompts to guide slot-value pair inference from a semantic perspective. Finally, we are inspired by prevalent multi-task learning to introduce an auxiliary sub-task, which helps to learn relationships among provided labels. Experimental results show that our framework outperforms several state-of-the-art baselines on two public datasets.

* Work in progress 

HPT: Hierarchy-aware Prompt Tuning for Hierarchical Text Classification

Apr 28, 2022
Zihan Wang, Peiyi Wang, Tianyu Liu, Yunbo Cao, Zhifang Sui, Houfeng Wang

Hierarchical text classification (HTC) is a challenging subtask of multi-label classification due to its complex label hierarchy. Recently, pretrained language models (PLMs) have been widely adopted in HTC through a fine-tuning paradigm. However, in this paradigm, there exists a huge gap between the classification tasks with sophisticated label hierarchy and the masked language model (MLM) pretraining tasks of PLMs, and thus the potential of PLMs cannot be fully tapped. To bridge the gap, in this paper, we propose HPT, a Hierarchy-aware Prompt Tuning method to handle HTC from a multi-label MLM perspective. Specifically, we construct a dynamic virtual template and label words, which take the form of soft prompts to fuse the label hierarchy knowledge, and introduce a zero-bounded multi-label cross entropy loss to harmonize the objectives of HTC and MLM. Extensive experiments show HPT achieves state-of-the-art performance on 3 popular HTC datasets and is adept at handling imbalanced and low-resource situations.

* Work in progress. First two authors contribute equally 

Incorporating Hierarchy into Text Encoder: a Contrastive Learning Approach for Hierarchical Text Classification

Mar 23, 2022
Zihan Wang, Peiyi Wang, Lianzhe Huang, Xin Sun, Houfeng Wang

Hierarchical text classification is a challenging subtask of multi-label classification due to its complex label hierarchy. Existing methods encode text and label hierarchy separately and mix their representations for classification, where the hierarchy remains unchanged for all input text. Instead of modeling them separately, in this work, we propose Hierarchy-guided Contrastive Learning (HGCLR) to directly embed the hierarchy into a text encoder. During training, HGCLR constructs positive samples for input text under the guidance of the label hierarchy. By pulling together the input text and its positive sample, the text encoder can learn to generate the hierarchy-aware text representation independently. Therefore, after training, the HGCLR-enhanced text encoder can dispense with the redundant hierarchy. Extensive experiments on three benchmark datasets verify the effectiveness of HGCLR.

* ACL 2022 main conference 

Confidence Calibration for Intent Detection via Hyperspherical Space and Rebalanced Accuracy-Uncertainty Loss

Mar 17, 2022
Yantao Gong, Cao Liu, Fan Yang, Xunliang Cai, Guanglu Wan, Jiansong Chen, Weipeng Zhang, Houfeng Wang

Data-driven methods have achieved notable performance on intent detection, which is a task to comprehend user queries. Nonetheless, they are controversial for over-confident predictions. In some scenarios, users care not only about accuracy but also about the confidence of the model. Unfortunately, mainstream neural networks are poorly calibrated, with a large gap between accuracy and confidence. To handle this problem, defined as confidence calibration, we propose a model using the hyperspherical space and rebalanced accuracy-uncertainty loss. Specifically, we project the label vector onto hyperspherical space uniformly to generate a dense label representation matrix, which mitigates over-confident predictions due to overfitting to a sparse one-hot label matrix. Besides, we rebalance samples of different accuracy and uncertainty to better guide model training. Experiments on the open datasets verify that our model outperforms the existing calibration methods and achieves a significant improvement on the calibration metric.
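
The "gap between accuracy and confidence" is commonly quantified with expected calibration error (ECE). The abstract does not name the calibration metric it reports, so the sketch below is only an illustration of that common choice; the function name and the equal-width binning with `n_bins=10` are assumptions.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average of |accuracy - mean confidence| over confidence bins.

    `confidences` are predicted probabilities in [0, 1] for the chosen
    label; `correct` are booleans saying whether each prediction was right.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Confidence 1.0 falls into the last bin rather than out of range.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# An over-confident model: 90% confidence but only half the answers right.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [True, True, False, False]))
```

A well-calibrated model would score close to zero, since its stated confidence in each bin would match how often it is actually right.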

Using calibrator to improve robustness in Machine Reading Comprehension

Feb 24, 2022
Jing Jin, Houfeng Wang

Machine Reading Comprehension (MRC) has achieved remarkable results since powerful models such as BERT were proposed. However, these models are not robust enough and are vulnerable to adversarial input perturbations and generalization examples. Some works tried to improve performance on specific types of data by adding related examples to the training data, but this leads to degradation on the original dataset, because the shift in data distribution makes answer ranking based on the model's softmax probability unreliable. In this paper, we propose a method to improve robustness by using a calibrator as a post-hoc reranker, which is implemented with an XGBoost model. The calibrator combines both manual features and representation learning features to rerank candidate results. Experimental results on adversarial datasets show that our model improves performance by more than 10% and also improves results on the original and generalization datasets.

Zero-shot Cross-lingual Transfer of Prompt-based Tuning with a Unified Multilingual Prompt

Feb 23, 2022
Lianzhe Huang, Shuming Ma, Dongdong Zhang, Furu Wei, Houfeng Wang

Prompt-based tuning has been proven effective for pretrained language models (PLMs). While most existing work focuses on monolingual prompts, we study multilingual prompts for multilingual PLMs, especially in the zero-shot cross-lingual setting. To alleviate the effort of designing different prompts for multiple languages, we propose a novel model that uses a unified prompt for all languages, called UniPrompt. Unlike discrete prompts and soft prompts, the unified prompt is model-based and language-agnostic. Specifically, the unified prompt is initialized by a multilingual PLM to produce a language-independent representation, which is then fused with the text input. During inference, the prompts can be pre-computed so that no extra computation cost is needed. To complement the unified prompt, we propose a new initialization method for the target label word to further improve the model's transferability across languages. Extensive experiments show that our proposed methods can significantly outperform the strong baselines across different languages. We will release data and code to facilitate future research.

A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model

Jan 26, 2022
Xin Sun, Tao Ge, Shuming Ma, Jingjing Li, Furu Wei, Houfeng Wang

Synthetic data construction for Grammatical Error Correction (GEC) in non-English languages relies heavily on human-designed and language-specific rules, which produce limited error-corrected patterns. In this paper, we propose a generic and language-independent strategy for multilingual GEC, which can train a GEC system effectively for a new non-English language with only two easy-to-access resources: 1) a pretrained cross-lingual language model (PXLM) and 2) parallel translation data between English and the language. Our approach creates diverse parallel GEC data without any language-specific operations by taking the non-autoregressive translation generated by PXLM and the gold translation as error-corrected sentence pairs. Then, we reuse PXLM to initialize the GEC model and pretrain it with the synthetic data generated by itself, which yields further improvement. We evaluate our approach on three public benchmarks of GEC in different languages. It achieves state-of-the-art results on the NLPCC 2018 Task 2 dataset (Chinese) and obtains competitive performance on Falko-Merlin (German) and RULEC-GEC (Russian). Further analysis demonstrates that our data construction method is complementary to rule-based approaches.
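
The data construction step can be pictured as pairing each imperfect machine translation with its gold translation. The sketch below is an illustration only: `build_gec_pairs` and `translate` are hypothetical names, `translate` stands in for the PXLM's non-autoregressive translation, and dropping pairs where the output already equals the gold sentence is an assumption not stated in the abstract.

```python
def build_gec_pairs(parallel_data, translate):
    """Turn (English, gold target) pairs into (erroneous, corrected) pairs.

    `translate` is a hypothetical callable mapping an English sentence to
    a possibly flawed target-language sentence.
    """
    pairs = []
    for english, gold in parallel_data:
        hypothesis = translate(english)
        # Assumption: identical output carries no error to correct, so skip it.
        if hypothesis != gold:
            pairs.append((hypothesis, gold))
    return pairs


# Toy translator that makes an agreement error on one sentence.
toy_outputs = {"She has two cats.": "Sie haben zwei Katzen.", "Hello.": "Hallo."}
data = [("She has two cats.", "Sie hat zwei Katzen."), ("Hello.", "Hallo.")]
print(build_gec_pairs(data, toy_outputs.get))
```

Because the errors come from the translation model rather than hand-written rules, nothing in this step depends on the target language.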
