Alert button

"chatbots": models, code, and papers
Alert button

Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data

Jan 14, 2023
Jing Wei, Sungdong Kim, Hyunhoon Jung, Young-Ho Kim

Figure 1 for Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data
Figure 2 for Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data
Figure 3 for Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data
Figure 4 for Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data

Large language models (LLMs) provide a new way to build chatbots by accepting natural language prompts. Yet, it is unclear how to design prompts to power chatbots to carry on naturalistic conversations while pursuing a given goal, such as collecting self-report data from users. We explore what design factors of prompts can help steer chatbots to talk naturally and collect data reliably. To this aim, we formulated four prompt designs with different structures and personas. Through an online study (N = 48) where participants conversed with chatbots driven by different designs of prompts, we assessed how prompt designs and conversation topics affected the conversation flows and users' perceptions of chatbots. Our chatbots covered 79% of the desired information slots during conversations, and the designs of prompts and topics significantly influenced the conversation flows and the data collection performance. We discuss the opportunities and challenges of building chatbots with LLMs.

* 22 pages including Appendix, 7 figures, 7 tables 

On Safe and Usable Chatbots for Promoting Voter Participation

Dec 28, 2022
Bharath Muppasani, Vishal Pallagani, Kausik Lakkaraju, Shuge Lei, Biplav Srivastava, Brett Robertson, Andrea Hickerson, Vignesh Narayanan

Figure 1 for On Safe and Usable Chatbots for Promoting Voter Participation
Figure 2 for On Safe and Usable Chatbots for Promoting Voter Participation
Figure 3 for On Safe and Usable Chatbots for Promoting Voter Participation
Figure 4 for On Safe and Usable Chatbots for Promoting Voter Participation

Chatbots, or bots for short, are multi-modal collaborative assistants that can help people complete useful tasks. Usually, when chatbots are referenced in connection with elections, they often draw negative reactions due to the fear of mis-information and hacking. Instead, in this paper, we explore how chatbots may be used to promote voter participation in vulnerable segments of society like senior citizens and first-time voters. In particular, we build a system that amplifies official information while personalizing it to users' unique needs transparently. We discuss its design, build prototypes with frequently asked questions (FAQ) election information for two US states that are low on an ease-of-voting scale, and report on its initial evaluation in a focus group. Our approach can be a win-win for voters, election agencies trying to fulfill their mandate and democracy at large.

* 7 pages, In AAAI 2023 Workshop on AI for Credible Elections 

Using In-Context Learning to Improve Dialogue Safety

Feb 02, 2023
Nicholas Meade, Spandana Gella, Devamanyu Hazarika, Prakhar Gupta, Di Jin, Siva Reddy, Yang Liu, Dilek Hakkani-Tür

Figure 1 for Using In-Context Learning to Improve Dialogue Safety
Figure 2 for Using In-Context Learning to Improve Dialogue Safety
Figure 3 for Using In-Context Learning to Improve Dialogue Safety
Figure 4 for Using In-Context Learning to Improve Dialogue Safety

While large neural-based conversational models have become increasingly proficient as dialogue agents, recent work has highlighted safety issues with these systems. For example, these systems can be goaded into generating toxic content, which often perpetuates social biases or stereotypes. We investigate a retrieval-based framework for reducing bias and toxicity in responses generated from neural-based chatbots. It uses in-context learning to steer a model towards safer generations. Concretely, to generate a response to an unsafe dialogue context, we retrieve demonstrations of safe model responses to similar dialogue contexts. We find our proposed approach performs competitively with strong baselines which use fine-tuning. For instance, using automatic evaluation, we find our best fine-tuned baseline only generates safe responses to unsafe dialogue contexts from DiaSafety 2.92% more than our approach. Finally, we also propose a straightforward re-ranking procedure which can further improve response safeness.

Inform the uninformed: Improving Online Informed Consent Reading with an AI-Powered Chatbot

Feb 02, 2023
Ziang Xiao, Tiffany Wenting Li, Karrie Karahalios, Hari Sundaram

Informed consent is a core cornerstone of ethics in human subject research. Through the informed consent process, participants learn about the study procedure, benefits, risks, and more to make an informed decision. However, recent studies showed that current practices might lead to uninformed decisions and expose participants to unknown risks, especially in online studies. Without the researcher's presence and guidance, online participants must read a lengthy form on their own with no answers to their questions. In this paper, we examined the role of an AI-powered chatbot in improving informed consent online. By comparing the chatbot with form-based interaction, we found the chatbot improved consent form reading, promoted participants' feelings of agency, and closed the power gap between the participant and the researcher. Our exploratory analysis further revealed the altered power dynamic might eventually benefit study response quality. We discussed design implications for creating AI-powered chatbots to offer effective informed consent in broader settings.

* Accepted by CHI 2023 

Chatbots in a Botnet World

Dec 22, 2022
Forrest McKee, David Noever

Figure 1 for Chatbots in a Botnet World
Figure 2 for Chatbots in a Botnet World
Figure 3 for Chatbots in a Botnet World
Figure 4 for Chatbots in a Botnet World

Question-and-answer formats provide a novel experimental platform for investigating cybersecurity questions. Unlike previous chatbots, the latest ChatGPT model from OpenAI supports an advanced understanding of complex coding questions. The research demonstrates thirteen coding tasks that generally qualify as stages in the MITRE ATT&CK framework, ranging from credential access to defense evasion. With varying success, the experimental prompts generate examples of keyloggers, logic bombs, obfuscated worms, and payment-fulfilled ransomware. The empirical results illustrate cases that support the broad gain of functionality, including self-replication and self-modification, evasion, and strategic understanding of complex cybersecurity goals. One surprising feature of ChatGPT as a language-only model centers on its ability to spawn coding approaches that yield images that obfuscate or embed executable programming steps or links.

Audience-Centric Natural Language Generation via Style Infusion

Jan 24, 2023
Samraj Moorjani, Adit Krishnan, Hari Sundaram, Ewa Maslowska, Aravind Sankar

Figure 1 for Audience-Centric Natural Language Generation via Style Infusion
Figure 2 for Audience-Centric Natural Language Generation via Style Infusion
Figure 3 for Audience-Centric Natural Language Generation via Style Infusion
Figure 4 for Audience-Centric Natural Language Generation via Style Infusion

Adopting contextually appropriate, audience-tailored linguistic styles is critical to the success of user-centric language generation systems (e.g., chatbots, computer-aided writing, dialog systems). While existing approaches demonstrate textual style transfer with large volumes of parallel or non-parallel data, we argue that grounding style on audience-independent external factors is innately limiting for two reasons. First, it is difficult to collect large volumes of audience-specific stylistic data. Second, some stylistic objectives (e.g., persuasiveness, memorability, empathy) are hard to define without audience feedback. In this paper, we propose the novel task of style infusion - infusing the stylistic preferences of audiences in pretrained language generation models. Since humans are better at pairwise comparisons than direct scoring - i.e., is Sample-A more persuasive/polite/empathic than Sample-B - we leverage limited pairwise human judgments to bootstrap a style analysis model and augment our seed set of judgments. We then infuse the learned textual style in a GPT-2 based text generator while balancing fluency and style adoption. With quantitative and qualitative assessments, we show that our infusion approach can generate compelling stylized examples with generic text prompts. The code and data are accessible at https://github.com/CrowdDynamicsLab/StyleInfusion.

* 14 pages, 3 figures, Accepted in Findings of EMNLP 2022 

Chatbots in a Honeypot World

Jan 10, 2023
Forrest McKee, David Noever

Figure 1 for Chatbots in a Honeypot World

Question-and-answer agents like ChatGPT offer a novel tool for use as a potential honeypot interface in cyber security. By imitating Linux, Mac, and Windows terminal commands and providing an interface for TeamViewer, nmap, and ping, it is possible to create a dynamic environment that can adapt to the actions of attackers and provide insight into their tactics, techniques, and procedures (TTPs). The paper illustrates ten diverse tasks that a conversational agent or large language model might answer appropriately to the effects of command-line attacker. The original result features feasibility studies for ten model tasks meant for defensive teams to mimic expected honeypot interfaces with minimal risks. Ultimately, the usefulness outside of forensic activities stems from whether the dynamic honeypot can extend the time-to-conquer or otherwise delay attacker timelines short of reaching key network assets like databases or confidential information. While ongoing maintenance and monitoring may be required, ChatGPT's ability to detect and deflect malicious activity makes it a valuable option for organizations seeking to enhance their cyber security posture. Future work will focus on cybersecurity layers, including perimeter security, host virus detection, and data security.

How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection

Jan 18, 2023
Biyang Guo, Xin Zhang, Ziyuan Wang, Minqi Jiang, Jinran Nie, Yuxuan Ding, Jianwei Yue, Yupeng Wu

Figure 1 for How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection
Figure 2 for How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection
Figure 3 for How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection
Figure 4 for How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection

The introduction of ChatGPT has garnered widespread attention in both academic and industrial communities. ChatGPT is able to respond effectively to a wide range of human questions, providing fluent and comprehensive answers that significantly surpass previous public chatbots in terms of security and usefulness. On one hand, people are curious about how ChatGPT is able to achieve such strength and how far it is from human experts. On the other hand, people are starting to worry about the potential negative impacts that large language models (LLMs) like ChatGPT could have on society, such as fake news, plagiarism, and social security issues. In this work, we collected tens of thousands of comparison responses from both human experts and ChatGPT, with questions ranging from open-domain, financial, medical, legal, and psychological areas. We call the collected dataset the Human ChatGPT Comparison Corpus (HC3). Based on the HC3 dataset, we study the characteristics of ChatGPT's responses, the differences and gaps from human experts, and future directions for LLMs. We conducted comprehensive human evaluations and linguistic analyses of ChatGPT-generated content compared with that of humans, where many interesting results are revealed. After that, we conduct extensive experiments on how to effectively detect whether a certain text is generated by ChatGPT or humans. We build three different detection systems, explore several key factors that influence their effectiveness, and evaluate them in different scenarios. The dataset, code, and models are all publicly available at https://github.com/Hello-SimpleAI/chatgpt-comparison-detection.

* https://github.com/Hello-SimpleAI/chatgpt-comparison-detection 

TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat

Jan 14, 2023
Hongpeng Lin, Ludan Ruan, Wenke Xia, Peiyu Liu, Jingyuan Wen, Yixin Xu, Di Hu, Ruihua Song, Wayne Xin Zhao, Qin Jin, Zhiwu Lu

Figure 1 for TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat
Figure 2 for TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat
Figure 3 for TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat
Figure 4 for TikTalk: A Multi-Modal Dialogue Dataset for Real-World Chitchat

We present a novel multi-modal chitchat dialogue dataset-TikTalk aimed at facilitating the research of intelligent chatbots. It consists of the videos and corresponding dialogues users generate on video social applications. In contrast to existing multi-modal dialogue datasets, we construct dialogue corpora based on video comment-reply pairs, which is more similar to chitchat in real-world dialogue scenarios. Our dialogue context includes three modalities: text, vision, and audio. Compared with previous image-based dialogue datasets, the richer sources of context in TikTalk lead to a greater diversity of conversations. TikTalk contains over 38K videos and 367K dialogues. Data analysis shows that responses in TikTalk are in correlation with various contexts and external knowledge. It poses a great challenge for the deep understanding of multi-modal information and the generation of responses. We evaluate several baselines on three types of automatic metrics and conduct case studies. Experimental results demonstrate that there is still a large room for future improvement on TikTalk. Our dataset is available at \url{https://github.com/RUC-AIMind/TikTalk}.