Zezhong Wang

Self-Guard: Empower the LLM to Safeguard Itself

Oct 24, 2023
Zezhong Wang, Fangkai Yang, Lu Wang, Pu Zhao, Hongru Wang, Liang Chen, Qingwei Lin, Kam-Fai Wong

Jailbreak attacks can bypass the safety measures of a Large Language Model (LLM) and cause it to generate harmful content. This misuse of LLMs has led to negative societal consequences. Currently, there are two main approaches to addressing jailbreak attacks: safety training and safeguards. Safety training focuses on further training the LLM to enhance its safety, whereas safeguards involve external models or filters that block harmful outputs. However, safety training is limited in its ability to adapt to new attack types and often leads to a drop in model performance, and safeguards have proven to be of limited help. To tackle these issues, we propose a novel approach called Self-Guard, which combines the strengths of both safety methods. Self-Guard consists of two stages: in the first, we enhance the model's ability to assess harmful content; in the second, we instruct the model to consistently perform harmful-content detection on its own responses. Experiments demonstrate that Self-Guard is robust against jailbreak attacks. In the bad-case analysis, we find that the LLM occasionally provides harmless responses to harmful queries. Additionally, we evaluate the general capabilities of the LLM before and after safety training, providing evidence that Self-Guard does not degrade the LLM's performance. In sensitivity tests, Self-Guard not only avoids inducing over-sensitivity in the LLM but can even mitigate this issue.
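A minimal sketch of the inference-time behaviour described in the second stage, assuming the model has been trained to append a harm tag to each of its own responses. This is not the authors' code: `call_llm`, the instruction suffix, and the tag names are illustrative stand-ins for whatever chat-completion interface and tagging scheme are actually used.

```python
# Illustrative Self-Guard-style wrapper (assumed form, not the paper's implementation).
from typing import Callable

SELF_CHECK_SUFFIX = (
    "\n\nAfter answering, review your own response and append exactly one tag "
    "on a new line: [harmless] or [harmful]."
)

def self_guarded_reply(call_llm: Callable[[str], str], user_query: str) -> str:
    """Generate a response, then honor the model's own harm tag."""
    raw = call_llm(user_query + SELF_CHECK_SUFFIX)
    if raw.rstrip().endswith("[harmful]"):
        # The model flagged its own output; withhold it.
        return "Sorry, I can't help with that request."
    # Strip the tag before returning the answer to the user.
    return raw.replace("[harmless]", "").strip()

if __name__ == "__main__":
    # Toy stand-in model that tags everything as harmless.
    fake_llm = lambda prompt: "Here is a safe answer.\n[harmless]"
    print(self_guarded_reply(fake_llm, "How do I bake bread?"))
```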

JoTR: A Joint Transformer and Reinforcement Learning Framework for Dialog Policy Learning

Sep 01, 2023
Wai-Chung Kwan, Huimin Wang, Hongru Wang, Zezhong Wang, Xian Wu, Yefeng Zheng, Kam-Fai Wong

Dialogue policy learning (DPL) is a crucial component of dialogue modelling. Its primary role is to determine the appropriate abstract response, commonly referred to as the "dialogue action". Traditional DPL methodologies treat this as a sequential decision problem over pre-defined action candidates extracted from a corpus. However, these incomplete candidates can significantly limit the diversity of responses and pose challenges when dealing with edge cases, i.e., scenarios that occur only at extreme operating parameters. To address these limitations, we introduce a novel framework, JoTR, which leverages a text-to-text Transformer-based model to generate flexible dialogue actions. Unlike traditional methods, JoTR formulates a word-level policy that allows for more dynamic and adaptable dialogue action generation without the need for any action templates. This setting enhances the diversity of responses and improves the system's ability to handle edge cases effectively. In addition, JoTR employs reinforcement learning with a reward-shaping mechanism to efficiently fine-tune the word-level dialogue policy, allowing the model to learn from its interactions and improve its performance over time. Our extensive evaluation shows that JoTR achieves state-of-the-art performance on two benchmark dialogue modelling tasks, as assessed by both user simulators and human evaluators.

* Our code, models and other related resources are publicly available at https://github.com/KwanWaiChung/JoTR 
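A rough sketch of word-level policy fine-tuning with REINFORCE and a shaped reward, as the abstract describes. It is not the released JoTR code (see the repository above): a tiny GRU policy stands in for the text-to-text Transformer, and `shaped_reward` is a hypothetical reward function.

```python
# Toy word-level policy optimized with REINFORCE and a shaped reward (illustrative only).
import torch
from torch import nn
from torch.distributions import Categorical

VOCAB, MAX_LEN = 50, 8

class ToyWordPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, 32)
        self.rnn = nn.GRU(32, 32, batch_first=True)
        self.head = nn.Linear(32, VOCAB)

    def forward(self, prefix):                      # prefix: (1, t) token ids
        h, _ = self.rnn(self.embed(prefix))
        return self.head(h[:, -1])                  # (1, VOCAB) logits for the next word

def shaped_reward(tokens):
    """Hypothetical shaped reward: sparse task term plus a dense length penalty."""
    task = 1.0 if tokens and tokens[-1] == 0 else 0.0   # pretend id 0 closes a good action
    return task - 0.01 * len(tokens)

policy = ToyWordPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for episode in range(200):
    prefix = torch.zeros(1, 1, dtype=torch.long)    # start-of-action token
    log_probs = []
    for _ in range(MAX_LEN):                        # sample the dialogue action word by word
        dist = Categorical(logits=policy(prefix))
        tok = dist.sample()
        log_probs.append(dist.log_prob(tok))
        prefix = torch.cat([prefix, tok.view(1, 1)], dim=1)
        if tok.item() == 0:                         # end-of-action
            break
    reward = shaped_reward(prefix[0, 1:].tolist())
    loss = -torch.stack(log_probs).sum() * reward   # REINFORCE objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```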

Towards Robust Personalized Dialogue Generation via Order-Insensitive Representation Regularization

May 22, 2023
Liang Chen, Hongru Wang, Yang Deng, Wai-Chung Kwan, Zezhong Wang, Kam-Fai Wong

Generating persona-consistent dialogue responses is important for developing an intelligent conversational agent. Recent works typically fine-tune large-scale pre-trained models on this task by concatenating persona texts and dialogue history into a single input sequence to generate the target response. While simple and effective, our analysis shows that this popular practice is seriously affected by order sensitivity: different input orders of persona sentences significantly impact the quality and consistency of the generated response, resulting in severe performance fluctuations (i.e., 29.4% on GPT2 and 83.2% on BART). To mitigate the order sensitivity problem, we propose a model-agnostic framework, ORder Insensitive Generation (ORIG), which enables dialogue models to learn robust representations under different persona orders and improves the consistency of response generation. Experiments on the Persona-Chat dataset justify the effectiveness and superiority of our method with two dominant pre-trained models (GPT2 and BART).

* ACL 2023 
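A minimal sketch of the order-insensitive consistency idea, under the assumption that it can be expressed as a divergence penalty between predictions for two persona orderings added to the usual generation loss; the actual ORIG regularizer may be formulated differently. The logits are assumed to come from the same dialogue model run on two shuffled persona orders, and the 0.1 weight is arbitrary.

```python
# Illustrative order-consistency regularizer (assumed form, not the ORIG implementation).
import torch
import torch.nn.functional as F

def orig_style_loss(logits_a, logits_b, target, pad_id=0, reg_weight=0.1):
    """logits_a / logits_b: (batch, seq, vocab) predictions for the same response
    under two different persona orders; target: (batch, seq) gold token ids."""
    # Standard next-token cross-entropy on one ordering.
    gen_loss = F.cross_entropy(
        logits_a.reshape(-1, logits_a.size(-1)), target.reshape(-1), ignore_index=pad_id
    )
    # Consistency term: symmetric KL between the two orderings' predictions.
    p = F.log_softmax(logits_a, dim=-1)
    q = F.log_softmax(logits_b, dim=-1)
    consistency = 0.5 * (
        F.kl_div(p, q, log_target=True, reduction="batchmean")
        + F.kl_div(q, p, log_target=True, reduction="batchmean")
    )
    return gen_loss + reg_weight * consistency

if __name__ == "__main__":
    B, T, V = 2, 5, 100
    la, lb = torch.randn(B, T, V), torch.randn(B, T, V)
    tgt = torch.randint(1, V, (B, T))
    print(orig_style_loss(la, lb, tgt))
```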

Chain-of-thought prompting for responding to in-depth dialogue questions with LLM

May 19, 2023
Hongru Wang, Rui Wang, Fei Mi, Zezhong Wang, Ruifeng Xu, Kam-Fai Wong

The way and content in which users ask questions can provide insight into their current status, including their personality, emotions, and psychology. Instead of directly prompting large language models (LLMs), we explore how chain-of-thought prompting helps in this scenario to perform reasoning and planning according to user status, aiming to provide a more personalized and engaging experience for the user query. To this end, we first construct a benchmark of 6 dialogue or question-answering datasets in both English and Chinese, covering 3 different aspects of user status (personality, emotion, and psychology). We then prompt the LLMs to generate the response with the user status as an intermediate reasoning process, and propose a novel demonstration selection strategy that uses the semantic similarity of the intermediate reasoning instead of the test queries. To evaluate the effectiveness and robustness of our approach, we conduct extensive experiments with 7 LLMs under zero-shot and one-shot settings. The experimental results show that our approach consistently outperforms standard prompting in terms of both helpfulness and acceptness across all datasets, regardless of the LLMs used. The code and dataset can be found at https://github.com/ruleGreen/Dialogue_CoT.git.
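A rough sketch of the two-step flow described above: infer the user's status first, pick a demonstration whose intermediate reasoning is most similar to that status, and only then answer. This is not the released Dialogue_CoT code; `call_llm`, `embed`, the prompt wording, and the demo format are hypothetical stand-ins.

```python
# Illustrative chain-of-thought-over-user-status prompting (assumed form).
from typing import Callable, List, Tuple

def cot_user_status_reply(
    call_llm: Callable[[str], str],
    embed: Callable[[str], List[float]],
    demos: List[Tuple[str, str, str]],   # (question, status_reasoning, response)
    user_query: str,
) -> str:
    # Step 1: let the model reason about the user's status first.
    status = call_llm(
        "Infer the user's personality, emotion, and psychological state "
        f"from this message:\n{user_query}"
    )

    # Demonstration selection: match on the intermediate reasoning, not the raw query.
    def cosine(a: List[float], b: List[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(x * x for x in b) ** 0.5
        return dot / (na * nb + 1e-9)

    s_vec = embed(status)
    best = max(demos, key=lambda d: cosine(embed(d[1]), s_vec))

    # Step 2: answer the query conditioned on the inferred status and one demonstration.
    prompt = (
        f"Example question: {best[0]}\nExample reasoning: {best[1]}\n"
        f"Example response: {best[2]}\n\n"
        f"User status: {status}\nUser question: {user_query}\nResponse:"
    )
    return call_llm(prompt)
```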

Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering

May 19, 2023
Zezhong Wang, Fangkai Yang, Pu Zhao, Lu Wang, Jue Zhang, Mohit Garg, Qingwei Lin, Dongmei Zhang

Large Language Models (LLMs) have gained popularity and achieved remarkable results in open-domain tasks, but their performance in real industrial domain-specific scenarios is limited because they lack the required domain knowledge. This issue has attracted widespread attention, but few relevant benchmarks are available. In this paper, we provide a benchmark Question Answering (QA) dataset named MSQA, covering Microsoft products and IT technical problems encountered by customers. This dataset contains industry cloud-specific QA knowledge that is unavailable to general LLMs, making it well suited for evaluating methods aimed at improving the domain-specific capabilities of LLMs. In addition, we propose a new model interaction paradigm that can empower an LLM to achieve better performance on domain-specific tasks where it is not proficient. Extensive experiments demonstrate that the approach following our model fusion framework outperforms the commonly used LLM-with-retrieval methods.

* 13 pages, 1 figure 
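The abstract contrasts the proposed fusion paradigm with the common "retrieve, then prompt" baseline; the sketch below shows only that baseline, since the paper's own interaction paradigm is not detailed here. `call_llm` and the toy keyword retriever are illustrative stand-ins.

```python
# Minimal retrieval-augmented QA baseline (illustrative, not the paper's method).
from typing import Callable, List

def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Toy keyword-overlap retriever over a domain knowledge base."""
    q_terms = set(query.lower().split())
    scored = sorted(docs, key=lambda d: len(q_terms & set(d.lower().split())), reverse=True)
    return scored[:k]

def retrieval_qa(call_llm: Callable[[str], str], query: str, docs: List[str]) -> str:
    context = "\n".join(retrieve(query, docs))
    return call_llm(f"Answer using the context below.\nContext:\n{context}\n\nQuestion: {query}")
```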

Testability-Aware Low Power Controller Design with Evolutionary Learning

Nov 26, 2021
Min Li, Zhengyuan Shi, Zezhong Wang, Weiwei Zhang, Yu Huang, Qiang Xu

The XORNet-based low-power controller is a popular technique for reducing circuit transitions in scan-based testing. However, existing solutions construct the XORNet evenly for scan chain control, which may result in sub-optimal solutions without any design guidance. In this paper, we propose a novel testability-aware low-power controller with evolutionary learning. The XORNet generated from the proposed genetic algorithm (GA) enables adaptive control of scan chains according to their usage, thereby significantly improving XORNet encoding capacity, reducing the number of failure cases with ATPG, and decreasing test data volume. Experimental results indicate that under the same number of control bits, our GA-guided XORNet design can improve fault coverage by up to 2.11%. The proposed GA-guided XORNet also allows reducing the number of control bits, and the total testing time decreases by 20.78% on average and up to 47.09% compared to the existing design, without sacrificing test coverage.

* Accepted by ITC 2021 (short paper). This is the long paper version. Code is available on https://github.com/lee-man/ga-testing 
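A generic genetic-algorithm skeleton of the kind the abstract describes, where each individual encodes an XORNet connection assignment as a bit string. It is illustrative only (the code linked above is the real implementation): the real fitness would be derived from scan-chain usage and ATPG results, whereas the placeholder below just rewards a fixed bit pattern.

```python
# Toy GA loop over bit-string genomes (illustrative fitness, selection, crossover, mutation).
import random

GENOME_LEN, POP, GENS, MUT_RATE = 64, 30, 50, 0.02

def fitness(genome):
    # Placeholder objective: in the paper this would reflect encoding capacity
    # and test metrics; here we simply reward an alternating bit pattern.
    return sum(1 for i, b in enumerate(genome) if b == (i % 2))

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)          # single-point crossover
    return a[:cut] + b[cut:]

def mutate(g):
    return [1 - b if random.random() < MUT_RATE else b for b in g]

population = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(POP)]
for _ in range(GENS):
    population.sort(key=fitness, reverse=True)
    parents = population[: POP // 2]               # truncation selection
    children = [
        mutate(crossover(random.choice(parents), random.choice(parents)))
        for _ in range(POP - len(parents))
    ]
    population = parents + children

print("best fitness:", fitness(max(population, key=fitness)))
```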

Integrating Pretrained Language Model for Dialogue Policy Learning

Nov 02, 2021
Hongru Wang, Huimin Wang, Zezhong Wang, Kam-Fai Wong

Reinforcement Learning (RL) has shown its potential for training a dialogue policy agent towards maximizing the accumulated rewards given by users. However, the reward can be very sparse since it is usually only provided at the end of a dialogue session, which leads to unaffordable interaction requirements before an acceptable dialogue agent is obtained. In contrast to many efforts dedicated to alternately optimizing the policy and recovering the reward, which easily get stuck in local optima and suffer from model collapse, we decompose the adversarial training into two steps: 1) we integrate a pre-trained language model as a discriminator to judge whether the current system action is good enough for the last user action (i.e., next action prediction); 2) the discriminator gives an extra local dense reward to guide the agent's exploration. The experimental results demonstrate that our method significantly improves the completion rate (~4.4%) and success rate (~8.0%) of the dialogue system.
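A sketch of the reward decomposition described above, under the assumption that the discriminator's next-action-prediction score is simply added as a local dense term to the sparse session reward; the weighting and interfaces below are hypothetical, not the authors' code.

```python
# Illustrative dense-reward shaping with a discriminator score (assumed form).
from typing import Callable

def shaped_reward(
    discriminator: Callable[[str, str], float],  # plausibility of system action given user action
    user_action: str,
    system_action: str,
    env_reward: float,            # sparse reward from the user / simulator
    weight: float = 0.5,          # illustrative weighting of the dense term
) -> float:
    dense = discriminator(user_action, system_action)
    return env_reward + weight * dense

# Toy usage with a stand-in discriminator.
toy_disc = lambda u, s: 1.0 if "inform" in s else 0.2
print(shaped_reward(toy_disc, "request(restaurant)", "inform(name=ABC)", env_reward=0.0))
```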

Prior Omission of Dissimilar Source Domain(s) for Cost-Effective Few-Shot Learning

Sep 11, 2021
Zezhong Wang, Hongru Wang, Kwan Wai Chung, Jia Zhu, Gabriel Pui Cheong Fung, Kam-Fai Wong

Few-shot slot tagging is an emerging research topic in the field of Natural Language Understanding (NLU). Given sufficient annotated data from source domains, the key challenge is how to train and adapt the model to a target domain that has only a few labels. Conventional few-shot approaches use all the data from the source domains without considering inter-domain relations and implicitly assume each sample in a domain contributes equally. However, our experiments show that the data distribution bias among different domains significantly affects adaptation performance. Moreover, transferring knowledge from dissimilar domains even introduces extra noise that hurts model performance. To tackle this problem, we propose an effective similarity-based method to select data from the source domains. In addition, we propose a Shared-Private Network (SP-Net) for the few-shot slot tagging task. Words from the same class share some common features; we extract those shared features from the limited annotated data in the target domain and merge them as the label embedding to help predict other unlabelled data in the target domain. Experiments show that our method outperforms state-of-the-art approaches with less source data. The results also prove that some training data from dissimilar sources are redundant and even harmful for adaptation.
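A sketch of one plausible form of the similarity-based source selection: represent each domain by the mean embedding of its utterances, score source domains by cosine similarity to the target centroid, and omit the dissimilar ones. The exact similarity measure in the paper may differ, and `embed` is a hypothetical sentence-embedding function.

```python
# Illustrative source-domain selection by centroid cosine similarity (assumed form).
import numpy as np
from typing import Callable, Dict, List

def select_source_domains(
    embed: Callable[[str], np.ndarray],
    source_domains: Dict[str, List[str]],   # domain name -> utterances
    target_samples: List[str],              # the few labelled target utterances
    top_k: int = 3,
) -> List[str]:
    def centroid(texts: List[str]) -> np.ndarray:
        return np.mean([embed(t) for t in texts], axis=0)

    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

    tgt = centroid(target_samples)
    scores = {name: cosine(centroid(texts), tgt) for name, texts in source_domains.items()}
    # Keep the top_k most similar domains; dissimilar sources are omitted entirely.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```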

MCML: A Novel Memory-based Contrastive Meta-Learning Method for Few Shot Slot Tagging

Aug 28, 2021
Hongru Wang, Zezhong Wang, Gabriel Pui Cheong Fung, Kam-Fai Wong

Meta-learning is widely used for few-shot slot tagging. However, the performance of existing methods is seriously affected by catastrophic forgetting, a phenomenon common in deep learning when the training and testing modules fail to take historical information into account, i.e., previously trained episodes in metric-based meta-learning. To overcome this predicament, we propose the Memory-based Contrastive Meta-learning (MCML) method. Specifically, we propose a learn-from-memory mechanism that uses explicit memory to keep track of the label representations of previously trained episodes, together with a contrastive learning method that compares the current label embedded in the few-shot episode with the historic ones stored in the memory, and an adaption-from-memory mechanism that determines the output label based on the contrast between the input labels embedded in the test episode and the label clusters in the memory. Experimental results show that MCML is scalable and outperforms metric-based and optimization-based meta-learning in all 1-shot, 5-shot, 10-shot, and 20-shot scenarios on the SNIPS dataset.
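A minimal sketch of the memory-plus-contrast idea, assuming label prototypes from earlier episodes are kept in an explicit memory and the current episode's label embedding is pulled towards its stored counterpart and pushed away from other stored labels with an InfoNCE-style loss. The memory update rule, temperature, and momentum below are illustrative choices, not the MCML implementation.

```python
# Illustrative label-prototype memory with a contrastive loss (assumed form).
import torch
import torch.nn.functional as F

class LabelMemory:
    def __init__(self, momentum: float = 0.9):
        self.bank = {}            # label name -> prototype vector
        self.momentum = momentum

    def update(self, label: str, proto: torch.Tensor):
        """Store or momentum-update the prototype for a label after an episode."""
        old = self.bank.get(label)
        self.bank[label] = proto.detach() if old is None else (
            self.momentum * old + (1 - self.momentum) * proto.detach()
        )

    def contrastive_loss(self, label: str, proto: torch.Tensor, temp: float = 0.1):
        """Pull the current label embedding towards its stored prototype, push from others."""
        if label not in self.bank or len(self.bank) < 2:
            return proto.new_zeros(())
        names = list(self.bank)
        mem = torch.stack([self.bank[n] for n in names])              # (M, d)
        sims = F.cosine_similarity(proto.unsqueeze(0), mem) / temp    # (M,)
        target = torch.tensor(names.index(label))
        return F.cross_entropy(sims.unsqueeze(0), target.unsqueeze(0))

# Toy usage: prototypes from a previous episode, then a loss for the current one.
mem = LabelMemory()
mem.update("B-artist", torch.randn(16))
mem.update("O", torch.randn(16))
current = torch.randn(16, requires_grad=True)
loss = mem.contrastive_loss("B-artist", current)
loss.backward()
```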
