Takyoung Kim

KoSBi: A Dataset for Mitigating Social Bias Risks Towards Safer Large Language Model Application

May 30, 2023
Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Gunhee Kim, Jung-Woo Ha

Large language models (LLMs) learn not only natural text generation abilities but also social biases against different demographic groups from real-world data. This poses a critical risk when deploying LLM-based applications. Existing research and resources are not readily applicable in South Korea due to the differences in language and culture, both of which significantly affect the biases and targeted demographic groups. This limitation requires localized social bias datasets to ensure the safe and effective deployment of LLMs. To this end, we present KoSBi, a new social bias dataset of 34k pairs of contexts and sentences in Korean covering 72 demographic groups in 15 categories. We find that through filtering-based moderation, social biases in generated content can be reduced by 16.47%p on average for HyperCLOVA (30B and 82B), and GPT-3.

* 17 pages, 8 figures, 12 tables, ACL 2023 
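
As a rough illustration of the filtering-based moderation evaluated above, the sketch below scores each candidate generation with a bias classifier (assumed to be trained on KoSBi-style labeled sentences) and keeps only candidates below a threshold. The function names and threshold are illustrative assumptions, not the authors' implementation.

```python
from typing import Callable, List

def filter_generations(
    candidates: List[str],
    bias_score: Callable[[str], float],
    threshold: float = 0.5,
) -> List[str]:
    """Keep only candidates whose bias score falls below the threshold.

    `bias_score` is assumed to be a classifier trained on KoSBi-style
    labeled sentences, returning the probability that a sentence is unsafe.
    """
    return [c for c in candidates if bias_score(c) < threshold]

# Illustrative usage: sample several candidates from an LLM, then surface
# only those the moderation classifier considers safe.
# candidates = llm.generate(prompt, num_return_sequences=5)
# safe = filter_generations(candidates, bias_score=my_classifier.predict_proba)
```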

SQuARe: A Large-Scale Dataset of Sensitive Questions and Acceptable Responses Created Through Human-Machine Collaboration

May 28, 2023
Hwaran Lee, Seokhee Hong, Joonsuk Park, Takyoung Kim, Meeyoung Cha, Yejin Choi, Byoung Pil Kim, Gunhee Kim, Eun-Ju Lee, Yong Lim, Alice Oh, Sangchul Park, Jung-Woo Ha

The potential social harms that large language models pose, such as generating offensive content and reinforcing biases, are steeply rising. Existing work focuses on coping with this concern while interacting with ill-intentioned users, such as those who explicitly produce hate speech or try to elicit harmful responses. However, discussions on sensitive issues can become toxic even when the users are well-intentioned. For safer models in such scenarios, we present the Sensitive Questions and Acceptable Response (SQuARe) dataset, a large-scale Korean dataset of 49k sensitive questions with 42k acceptable and 46k non-acceptable responses. The dataset was constructed using HyperCLOVA in a human-in-the-loop manner based on real news headlines. Experiments show that acceptable response generation improves significantly for HyperCLOVA and GPT-3, demonstrating the efficacy of this dataset.

* 19 pages, 10 figures, ACL 2023 

Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation

May 23, 2023
Takyoung Kim, Jamin Shin, Young-Ho Kim, Sanghwan Bae, Sungdong Kim

Most task-oriented dialogue (TOD) benchmarks assume users who know exactly how to use the system, constraining user behavior within the system's capabilities via strict user goals, namely "user familiarity" bias. This data bias deepens when combined with data-driven TOD systems, as its effects cannot be gauged with existing static evaluations. Hence, we conduct an interactive user study to unveil how vulnerable TOD systems are to realistic scenarios. In particular, we compare users given 1) detailed goal instructions that conform to the system boundaries (closed-goal) and 2) vague goal instructions that are often unsupported but realistic (open-goal). Our study reveals that conversations in the open-goal setting lead to catastrophic failures of the system: 92% of the dialogues had significant issues. Moreover, we conduct a thorough analysis to identify distinctive features of the two settings through error annotation. From this, we discover a novel "pretending" behavior, in which the system pretends to handle user requests even though they are beyond its capabilities. We discuss its characteristics and toxicity while emphasizing transparency and fallback strategies for robust TOD systems.


DSTEA: Dialogue State Tracking with Entity Adaptive Pre-training

Jul 08, 2022
Yukyung Lee, Takyoung Kim, Hoonsang Yoon, Pilsung Kang, Junseong Bang, Misuk Kim

Dialogue state tracking (DST) is a core sub-module of a dialogue system that aims to extract the appropriate belief state (domain-slot-value) from system and user utterances. Most previous studies have attempted to improve performance by increasing the size of the pre-trained model or using additional features such as graph relations. In this study, we propose dialogue state tracking with entity adaptive pre-training (DSTEA), a system in which key entities in a sentence are trained more intensively by the encoder of the DST model. DSTEA extracts important entities from input dialogues in four ways and then applies selective knowledge masking to train the model effectively. Although DSTEA conducts only pre-training without directly infusing additional knowledge into the DST model, it achieved better performance than the best-known benchmark models on MultiWOZ 2.0, 2.1, and 2.2. The effectiveness of DSTEA was verified through various comparative experiments with regard to the entity type and different adaptive settings.

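A minimal sketch of the selective knowledge masking idea described above, under the assumption that tokens marked as key entities are masked with higher probability than ordinary tokens before masked-language-model pre-training; the probabilities and the entity-extraction step are placeholders rather than the paper's exact procedure.

```python
import random
from typing import List, Set, Tuple

MASK_TOKEN = "[MASK]"

def selective_entity_masking(
    tokens: List[str],
    entity_positions: Set[int],
    entity_mask_prob: float = 0.8,
    other_mask_prob: float = 0.1,
) -> Tuple[List[str], List[str]]:
    """Mask entity tokens with higher probability than ordinary tokens.

    `entity_positions` is assumed to come from an upstream step that marks
    key entities (e.g. slot values such as restaurant names) in the dialogue.
    Returns the corrupted input and the original tokens as MLM labels.
    """
    masked, labels = [], list(tokens)
    for i, tok in enumerate(tokens):
        p = entity_mask_prob if i in entity_positions else other_mask_prob
        masked.append(MASK_TOKEN if random.random() < p else tok)
    return masked, labels

# Illustrative usage on a toy utterance:
# tokens = "i want a cheap hotel in cambridge".split()
# masked, labels = selective_entity_masking(tokens, entity_positions={3, 6})
```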

Mismatch between Multi-turn Dialogue and its Evaluation Metric in Dialogue State Tracking

Mar 31, 2022
Takyoung Kim, Hoonsang Yoon, Yukyung Lee, Pilsung Kang, Misuk Kim

Dialogue state tracking (DST) aims to extract essential information from multi-turn dialogue situations and take appropriate actions. A belief state, one of the core pieces of information, refers to the subject and its specific content and appears in the form of domain-slot-value. The trained model predicts "accumulated" belief states at every turn, and joint goal accuracy and slot accuracy are mainly used to evaluate the predictions; however, we point out that the current evaluation metrics have a critical limitation when evaluating belief states accumulated as the dialogue proceeds, especially in the most widely used MultiWOZ dataset. Additionally, we propose relative slot accuracy to complement the existing metrics. Relative slot accuracy does not depend on the number of predefined slots and allows intuitive evaluation by assigning relative scores according to the turn of each dialogue. This study also encourages reporting not only joint goal accuracy but also various complementary metrics in DST tasks for the sake of realistic evaluation.

* ACL 2022 (short) 
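
The sketch below shows one plausible reading of relative slot accuracy based only on the description above: each turn is scored over the slots that actually appear in the gold or predicted belief state, rather than over the full predefined slot ontology. The paper's exact formula may differ.

```python
from typing import Dict

def relative_slot_accuracy(gold: Dict[str, str], pred: Dict[str, str]) -> float:
    """Turn-level accuracy over only the slots present in gold or prediction.

    Unlike conventional slot accuracy, the denominator ignores the (large)
    set of predefined slots absent from both states, so unfilled slots do
    not inflate the score. This is an assumed reading of the metric, not
    the paper's exact definition.
    """
    relevant = set(gold) | set(pred)
    if not relevant:
        return 1.0  # empty belief states trivially agree
    correct = sum(1 for slot in relevant if gold.get(slot) == pred.get(slot))
    return correct / len(relevant)

# Example turn:
# gold = {"hotel-area": "north", "hotel-price": "cheap"}
# pred = {"hotel-area": "north", "hotel-stars": "4"}
# relative_slot_accuracy(gold, pred)  # 1 correct of 3 relevant slots ~ 0.33
```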

Oh My Mistake!: Toward Realistic Dialogue State Tracking including Turnback Utterances

Aug 28, 2021
Takyoung Kim, Yukyung Lee, Hoonsang Yoon, Pilsung Kang, Misuk Kim

The primary purpose of dialogue state tracking (DST), a critical component of an end-to-end conversational system, is to build a model that responds well to real-world situations. Although we often change our minds during ordinary conversations, current benchmark datasets do not adequately reflect such occurrences and instead consist of over-simplified conversations in which no one changes their mind. The main question inspiring the present study is: "Are current benchmark datasets sufficiently diverse to handle casual conversations in which one changes their mind?" We found that the answer is "No", because simply injecting template-based turnback utterances significantly degrades DST model performance. The test joint goal accuracy on MultiWOZ decreased by over 5%p when the simplest form of turnback utterance was injected, and the degradation worsens with more complicated turnback situations. However, we also observed that performance rebounds when turnbacks are appropriately included in the training dataset, implying that the problem lies not with the DST models but with the construction of the benchmark dataset.

* 13 pages, 5 figures 
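
A toy sketch of template-based turnback injection as described above: a user utterance revising a previously stated slot value is appended to the dialogue, and the gold belief state is updated to match. The template and data structures are illustrative assumptions, not the paper's generation pipeline.

```python
from typing import Dict, List, Tuple

TURNBACK_TEMPLATE = "oh sorry, my mistake. i actually want the {slot} to be {value}."

def inject_turnback(
    dialogue: List[str],
    belief_state: Dict[str, str],
    slot: str,
    new_value: str,
) -> Tuple[List[str], Dict[str, str]]:
    """Append a template-based turnback utterance that revises one slot.

    The user 'changes their mind' about `slot`, so the gold belief state
    must be updated to `new_value` for evaluation to remain consistent.
    """
    utterance = TURNBACK_TEMPLATE.format(slot=slot, value=new_value)
    updated_state = {**belief_state, slot: new_value}
    return dialogue + [utterance], updated_state

# Example:
# dialogue = ["i need a cheap hotel in the north."]
# state = {"hotel-price": "cheap", "hotel-area": "north"}
# inject_turnback(dialogue, state, "hotel-area", "south")
```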