Alert button
Picture for Di Jin

Di Jin

Alert button

Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language

Nov 24, 2023
Di Jin, Shikib Mehri, Devamanyu Hazarika, Aishwarya Padmakumar, Sungjin Lee, Yang Liu, Mahdi Namazifar

Learning from human feedback is a prominent technique to align the output of large language models (LLMs) with human expectations. Reinforcement learning from human feedback (RLHF) leverages human preference signals that are in the form of ranking of response pairs to perform this alignment. However, human preference on LLM outputs can come in much richer forms including natural language, which may provide detailed feedback on strengths and weaknesses of a given response. In this work we investigate data efficiency of modeling human feedback that is in natural language. Specifically, we fine-tune an open-source LLM, e.g., Falcon-40B-Instruct, on a relatively small amount (1000 records or even less) of human feedback in natural language in the form of critiques and revisions of responses. We show that this model is able to improve the quality of responses from even some of the strongest LLMs such as ChatGPT, BARD, and Vicuna, through critique and revision of those responses. For instance, through one iteration of revision of ChatGPT responses, the revised responses have 56.6% win rate over the original ones, and this win rate can be further improved to 65.9% after applying the revision for five iterations.

* Accepted by Workshop on Instruction Tuning and Instruction Following at NeurIPS 2023, Submitted to AAAI 2024 
Viaarxiv icon

A Survey on Fairness-aware Recommender Systems

Jun 01, 2023
Di Jin, Luzhi Wang, He Zhang, Yizhen Zheng, Weiping Ding, Feng Xia, Shirui Pan

Figure 1 for A Survey on Fairness-aware Recommender Systems
Figure 2 for A Survey on Fairness-aware Recommender Systems
Figure 3 for A Survey on Fairness-aware Recommender Systems
Figure 4 for A Survey on Fairness-aware Recommender Systems

As information filtering services, recommender systems have extremely enriched our daily life by providing personalized suggestions and facilitating people in decision-making, which makes them vital and indispensable to human society in the information era. However, as people become more dependent on them, recent studies show that recommender systems potentially own unintentional impacts on society and individuals because of their unfairness (e.g., gender discrimination in job recommendations). To develop trustworthy services, it is crucial to devise fairness-aware recommender systems that can mitigate these bias issues. In this survey, we summarise existing methodologies and practices of fairness in recommender systems. Firstly, we present concepts of fairness in different recommendation scenarios, comprehensively categorize current advances, and introduce typical methods to promote fairness in different stages of recommender systems. Next, after introducing datasets and evaluation metrics applied to assess the fairness of recommender systems, we will delve into the significant influence that fairness-aware recommender systems exert on real-world industrial applications. Subsequently, we highlight the connection between fairness and other principles of trustworthy recommender systems, aiming to consider trustworthiness principles holistically while advocating for fairness. Finally, we summarize this review, spotlighting promising opportunities in comprehending concepts, frameworks, the balance between accuracy and fairness, and the ties with trustworthiness, with the ultimate goal of fostering the development of fairness-aware recommender systems.

* 27 pages, 9 figures 
Viaarxiv icon

The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning

May 24, 2023
Jingyuan Qi, Zhiyang Xu, Ying Shen, Minqian Liu, Di Jin, Qifan Wang, Lifu Huang

Figure 1 for The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning
Figure 2 for The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning
Figure 3 for The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning
Figure 4 for The Art of SOCRATIC QUESTIONING: Zero-shot Multimodal Reasoning with Recursive Thinking and Self-Questioning

Chain-of-Thought prompting (CoT) enables large-scale language models to solve complex reasoning problems by decomposing the problem and tackling it step-by-step. However, Chain-of-Thought is a greedy thinking process that requires the language model to come up with a starting point and generate the next step solely based on previous steps. This thinking process is different from how humans approach a complex problem e.g., we proactively raise sub-problems related to the original problem and recursively answer them. In this work, we propose Socratic Questioning, a divide-and-conquer fashion algorithm that simulates the self-questioning and recursive thinking process. Socratic Questioning is driven by a Self-Questioning module that employs a large-scale language model to propose sub-problems related to the original problem as intermediate steps and Socratic Questioning recursively backtracks and answers the sub-problems until reaches the original problem. We apply our proposed algorithm to the visual question-answering task as a case study and by evaluating it on three public benchmark datasets, we observe a significant performance improvement over all baselines on (almost) all datasets. In addition, the qualitative analysis clearly demonstrates the intermediate thinking steps elicited by Socratic Questioning are similar to the human's recursively thinking process of a complex reasoning problem.

* 15 pages, 12 figure, 2 algorithms 
Viaarxiv icon

"What do others think?": Task-Oriented Conversational Modeling with Subjective Knowledge

May 20, 2023
Chao Zhao, Spandana Gella, Seokhwan Kim, Di Jin, Devamanyu Hazarika, Alexandros Papangelis, Behnam Hedayatnia, Mahdi Namazifar, Yang Liu, Dilek Hakkani-Tur

Figure 1 for "What do others think?": Task-Oriented Conversational Modeling with Subjective Knowledge
Figure 2 for "What do others think?": Task-Oriented Conversational Modeling with Subjective Knowledge
Figure 3 for "What do others think?": Task-Oriented Conversational Modeling with Subjective Knowledge
Figure 4 for "What do others think?": Task-Oriented Conversational Modeling with Subjective Knowledge

Task-oriented Dialogue (TOD) Systems aim to build dialogue systems that assist users in accomplishing specific goals, such as booking a hotel or a restaurant. Traditional TODs rely on domain-specific APIs/DBs or external factual knowledge to generate responses, which cannot accommodate subjective user requests (e.g., "Is the WIFI reliable?" or "Does the restaurant have a good atmosphere?"). To address this issue, we propose a novel task of subjective-knowledge-based TOD (SK-TOD). We also propose the first corresponding dataset, which contains subjective knowledge-seeking dialogue contexts and manually annotated responses grounded in subjective knowledge sources. When evaluated with existing TOD approaches, we find that this task poses new challenges such as aggregating diverse opinions from multiple knowledge snippets. We hope this task and dataset can promote further research on TOD and subjective content understanding. The code and the dataset are available at https://github.com/alexa/dstc11-track5.

Viaarxiv icon

Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation

May 10, 2023
Di Jin, Luzhi Wang, Yizhen Zheng, Guojie Song, Fei Jiang, Xiang Li, Wei Lin, Shirui Pan

Figure 1 for Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation
Figure 2 for Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation
Figure 3 for Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation
Figure 4 for Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation

Recommender systems are essential to various fields, e.g., e-commerce, e-learning, and streaming media. At present, graph neural networks (GNNs) for session-based recommendations normally can only recommend items existing in users' historical sessions. As a result, these GNNs have difficulty recommending items that users have never interacted with (new items), which leads to a phenomenon of information cocoon. Therefore, it is necessary to recommend new items to users. As there is no interaction between new items and users, we cannot include new items when building session graphs for GNN session-based recommender systems. Thus, it is challenging to recommend new items for users when using GNN-based methods. We regard this challenge as '\textbf{G}NN \textbf{S}ession-based \textbf{N}ew \textbf{I}tem \textbf{R}ecommendation (GSNIR)'. To solve this problem, we propose a dual-intent enhanced graph neural network for it. Due to the fact that new items are not tied to historical sessions, the users' intent is difficult to predict. We design a dual-intent network to learn user intent from an attention mechanism and the distribution of historical data respectively, which can simulate users' decision-making process in interacting with a new item. To solve the challenge that new items cannot be learned by GNNs, inspired by zero-shot learning (ZSL), we infer the new item representation in GNN space by using their attributes. By outputting new item probabilities, which contain recommendation scores of the corresponding items, the new items with higher scores are recommended to users. Experiments on two representative real-world datasets show the superiority of our proposed method. The case study from the real-world verifies interpretability benefits brought by the dual-intent module and the new item reasoning module. The code is available at Github: https://github.com/Ee1s/NirGNN

* 10 Pages, 6 figures, WWW'2023 
Viaarxiv icon

GNNs,You can be Stronger,Deeper and Faster

May 09, 2023
Jingbo Zhou, Yixuan Du, Ruqiong Zhang, Rui Zhang, Di Jin, Carl Yang

Figure 1 for GNNs,You can be Stronger,Deeper and Faster
Figure 2 for GNNs,You can be Stronger,Deeper and Faster
Figure 3 for GNNs,You can be Stronger,Deeper and Faster
Figure 4 for GNNs,You can be Stronger,Deeper and Faster

Graph neural networks (GNNs), a type of neural network that can learn from graph-structured data and learn the representation of nodes by aggregating their neighbors, have shown excellent performance in downstream tasks.However, it is known that the performance of graph neural networks (GNNs) degrades gradually as the number of layers increases. Based on k-hop subgraph aggregation, which is a new concept, we propose a new perspective to understand the expressive power of GNN.From this perspective, we reveal the potential causes of the performance degradation of the deep traditional GNN - aggregated subgraph overlap, and the fact that the residual-based graph neural networks in fact exploit the aggregation results of 1 to k hop subgraphs to improve the effectiveness.Further, we propose a new sampling-based node-level residual module named SDF, which is shown by theoretical derivation to obtain a superior expressive power compared to previous residual methods by using information from 1 to k hop subgraphs more flexibly. Extensive experiments show that the performance and efficiency of GNN with the SDF module outperform other methods.

Viaarxiv icon

KGTrust: Evaluating Trustworthiness of SIoT via Knowledge Enhanced Graph Neural Networks

Feb 22, 2023
Zhizhi Yu, Di Jin, Cuiying Huo, Zhiqiang Wang, Xiulong Liu, Heng Qi, Jia Wu, Lingfei Wu

Figure 1 for KGTrust: Evaluating Trustworthiness of SIoT via Knowledge Enhanced Graph Neural Networks
Figure 2 for KGTrust: Evaluating Trustworthiness of SIoT via Knowledge Enhanced Graph Neural Networks
Figure 3 for KGTrust: Evaluating Trustworthiness of SIoT via Knowledge Enhanced Graph Neural Networks
Figure 4 for KGTrust: Evaluating Trustworthiness of SIoT via Knowledge Enhanced Graph Neural Networks

Social Internet of Things (SIoT), a promising and emerging paradigm that injects the notion of social networking into smart objects (i.e., things), paving the way for the next generation of Internet of Things. However, due to the risks and uncertainty, a crucial and urgent problem to be settled is establishing reliable relationships within SIoT, that is, trust evaluation. Graph neural networks for trust evaluation typically adopt a straightforward way such as one-hot or node2vec to comprehend node characteristics, which ignores the valuable semantic knowledge attached to nodes. Moreover, the underlying structure of SIoT is usually complex, including both the heterogeneous graph structure and pairwise trust relationships, which renders hard to preserve the properties of SIoT trust during information propagation. To address these aforementioned problems, we propose a novel knowledge-enhanced graph neural network (KGTrust) for better trust evaluation in SIoT. Specifically, we first extract useful knowledge from users' comment behaviors and external structured triples related to object descriptions, in order to gain a deeper insight into the semantics of users and objects. Furthermore, we introduce a discriminative convolutional layer that utilizes heterogeneous graph structure, node semantics, and augmented trust relationships to learn node embeddings from the perspective of a user as a trustor or a trustee, effectively capturing multi-aspect properties of SIoT trust during information propagation. Finally, a trust prediction layer is developed to estimate the trust relationships between pairwise nodes. Extensive experiments on three public datasets illustrate the superior performance of KGTrust over state-of-the-art methods.

* Accepted by WWW-23 
Viaarxiv icon

Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information

Feb 10, 2023
Yen-Ting Lin, Alexandros Papangelis, Seokhwan Kim, Sungjin Lee, Devamanyu Hazarika, Mahdi Namazifar, Di Jin, Yang Liu, Dilek Hakkani-Tur

Figure 1 for Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information
Figure 2 for Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information
Figure 3 for Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information
Figure 4 for Selective In-Context Data Augmentation for Intent Detection using Pointwise V-Information

This work focuses on in-context data augmentation for intent detection. Having found that augmentation via in-context prompting of large pre-trained language models (PLMs) alone does not improve performance, we introduce a novel approach based on PLMs and pointwise V-information (PVI), a metric that can measure the usefulness of a datapoint for training a model. Our method first fine-tunes a PLM on a small seed of training data and then synthesizes new datapoints - utterances that correspond to given intents. It then employs intent-aware filtering, based on PVI, to remove datapoints that are not helpful to the downstream intent classifier. Our method is thus able to leverage the expressive power of large language models to produce diverse training data. Empirical results demonstrate that our method can produce synthetic training data that achieve state-of-the-art performance on three challenging intent detection datasets under few-shot settings (1.28% absolute improvement in 5-shot and 1.18% absolute in 10-shot, on average) and perform on par with the state-of-the-art in full-shot settings (within 0.01% absolute, on average).

* Accepted at EACL 2023 
Viaarxiv icon

Using In-Context Learning to Improve Dialogue Safety

Feb 02, 2023
Nicholas Meade, Spandana Gella, Devamanyu Hazarika, Prakhar Gupta, Di Jin, Siva Reddy, Yang Liu, Dilek Hakkani-Tür

Figure 1 for Using In-Context Learning to Improve Dialogue Safety
Figure 2 for Using In-Context Learning to Improve Dialogue Safety
Figure 3 for Using In-Context Learning to Improve Dialogue Safety
Figure 4 for Using In-Context Learning to Improve Dialogue Safety

While large neural-based conversational models have become increasingly proficient as dialogue agents, recent work has highlighted safety issues with these systems. For example, these systems can be goaded into generating toxic content, which often perpetuates social biases or stereotypes. We investigate a retrieval-based framework for reducing bias and toxicity in responses generated from neural-based chatbots. It uses in-context learning to steer a model towards safer generations. Concretely, to generate a response to an unsafe dialogue context, we retrieve demonstrations of safe model responses to similar dialogue contexts. We find our proposed approach performs competitively with strong baselines which use fine-tuning. For instance, using automatic evaluation, we find our best fine-tuned baseline only generates safe responses to unsafe dialogue contexts from DiaSafety 2.92% more than our approach. Finally, we also propose a straightforward re-ranking procedure which can further improve response safeness.

Viaarxiv icon

Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning

Jan 26, 2023
Mingyu Derek Ma, Jiun-Yu Kao, Shuyang Gao, Arpit Gupta, Di Jin, Tagyoung Chung, Nanyun Peng

Figure 1 for Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning
Figure 2 for Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning
Figure 3 for Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning
Figure 4 for Parameter-Efficient Low-Resource Dialogue State Tracking by Prompt Tuning

Dialogue state tracking (DST) is an important step in dialogue management to keep track of users' beliefs. Existing works fine-tune all language model (LM) parameters to tackle the DST task, which requires significant data and computing resources for training and hosting. The cost grows exponentially in the real-world deployment where dozens of fine-tuned LM are used for different domains and tasks. To reduce parameter size and better utilize cross-task shared information, we propose to use soft prompt token embeddings to learn task properties. Without tuning LM parameters, our method drastically reduces the number of parameters needed to less than 0.5% of prior works while achieves better low-resource DST performance.

* 5 pages, in the Second Workshop on Efficient Natural Language and Speech Processing (ENLSP) at NeurIPS 2022 
Viaarxiv icon