Emily Dinan

Effective Theory of Transformers at Initialization

Apr 04, 2023
Emily Dinan, Sho Yaida, Susan Zhang

We perform an effective-theory analysis of forward-backward signal propagation in wide and deep Transformers, i.e., residual neural networks with multi-head self-attention blocks and multilayer perceptron blocks. This analysis suggests particular width scalings of initialization and training hyperparameters for these models. We then take up such suggestions, training Vision and Language Transformers in practical setups.

64 pages, 5 figures
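
The "width scalings" referred to above can be made concrete with a small PyTorch sketch. The 1/fan_in variance rule below is a generic example of width-dependent initialization, shown only for illustration; it is not the paper's specific prescription for attention and MLP blocks, and the helper name init_width_scaled and constant c_w are assumptions.

```python
# Generic illustration (not the paper's exact prescription): initialize each
# Linear layer with weight variance proportional to 1/fan_in so that
# pre-activation statistics stay O(1) as the hidden width grows.
import math
import torch.nn as nn

def init_width_scaled(module: nn.Module, c_w: float = 1.0) -> None:
    for m in module.modules():
        if isinstance(m, nn.Linear):
            std = math.sqrt(c_w / m.in_features)  # variance ~ c_w / fan_in
            nn.init.normal_(m.weight, mean=0.0, std=std)
            if m.bias is not None:
                nn.init.zeros_(m.bias)

width = 512
mlp_block = nn.Sequential(
    nn.Linear(width, 4 * width),
    nn.GELU(),
    nn.Linear(4 * width, width),
)
init_width_scaled(mlp_block)
```

Under a 1/fan_in rule of this kind, the typical scale of each layer's outputs is roughly independent of the chosen width, which is the sort of width-insensitive behavior the effective-theory analysis is concerned with.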

Improving Chess Commentaries by Combining Language Models with Symbolic Reasoning Engines

Dec 15, 2022
Andrew Lee, David Wu, Emily Dinan, Mike Lewis

Despite many recent advancements in language modeling, state-of-the-art language models lack grounding in the real world and struggle with tasks involving complex reasoning. Meanwhile, advances in the symbolic reasoning capabilities of AI have led to systems that outperform humans in games like chess and Go (Silver et al., 2018). Chess commentary provides an interesting domain for bridging these two fields of research, as it requires reasoning over a complex board state and providing analyses in natural language. In this work we demonstrate how to combine symbolic reasoning engines with controllable language models to generate chess commentaries. We conduct experiments to demonstrate that our approach generates commentaries that are preferred by human judges over previous baselines.
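
A hedged sketch of the general recipe follows: query a symbolic engine for facts about the position, then hand those facts to a language model as conditioning text. It uses the python-chess library; the Stockfish path, prompt format, and function name engine_grounded_prompt are illustrative assumptions rather than the paper's actual system.

```python
# Sketch: ground a commentary prompt in engine analysis (python-chess plus a
# locally installed UCI engine such as Stockfish; the path is an assumption).
import chess
import chess.engine

def engine_grounded_prompt(moves_san, engine_path="stockfish", depth=15):
    board = chess.Board()
    for mv in moves_san:
        board.push_san(mv)
    engine = chess.engine.SimpleEngine.popen_uci(engine_path)
    try:
        info = engine.analyse(board, chess.engine.Limit(depth=depth))
        score = info["score"].white()          # evaluation from White's point of view
        best = info.get("pv", [None])[0]       # engine's preferred continuation
    finally:
        engine.quit()
    return (
        f"Position after: {' '.join(moves_san)}.\n"
        f"Engine evaluation (White): {score}. Suggested move: {best}.\n"
        "Write a short commentary on the last move."
    )

# The resulting prompt would then be passed to a controllable text generator.
print(engine_grounded_prompt(["e4", "e5", "Nf3", "Nc6", "Bb5"]))
```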


AutoReply: Detecting Nonsense in Dialogue Introspectively with Discriminative Replies

Nov 22, 2022
Weiyan Shi, Emily Dinan, Adi Renduchintala, Daniel Fried, Athul Paul Jacob, Zhou Yu, Mike Lewis

Existing approaches build separate classifiers to detect nonsense in dialogues. In this paper, we show that without external classifiers, dialogue models can detect errors in their own messages introspectively, by calculating the likelihood of replies that are indicative of poor messages. For example, if an agent believes its partner is likely to respond "I don't understand" to a candidate message, that message may not make sense, so an alternative message should be chosen. We evaluate our approach on a dataset from the game Diplomacy, which contains long dialogues richly grounded in the game state, on which existing models make many errors. We first show that hand-crafted replies can be effective for the task of detecting nonsense in applications as complex as Diplomacy. We then design AutoReply, an algorithm to search for such discriminative replies automatically, given a small number of annotated dialogue examples. We find that AutoReply-generated replies outperform hand-crafted replies and perform on par with carefully fine-tuned large supervised models. Our results also show that a single reply, with little computational overhead, can detect dialogue nonsense reasonably well.
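
The scoring step described above can be sketched with an off-the-shelf causal language model. Everything below is illustrative: "gpt2" stands in for the dialogue model, and the probe reply "I don't understand." stands in for the discriminative replies that AutoReply would search for.

```python
# Sketch: flag a candidate message as likely nonsense when the model assigns
# high probability to a discriminative reply such as "I don't understand.".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def probe_logprob(context: str, candidate: str, probe: str) -> float:
    """Average log-probability of `probe` given the dialogue so far."""
    prefix_ids = tokenizer(context + "\n" + candidate + "\n", return_tensors="pt").input_ids
    probe_ids = tokenizer(probe, return_tensors="pt").input_ids
    input_ids = torch.cat([prefix_ids, probe_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Position t predicts token t+1, so score only the probe tokens.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    start = prefix_ids.shape[1] - 1
    probe_lp = logprobs[start:].gather(1, input_ids[0, prefix_ids.shape[1]:].unsqueeze(1))
    return probe_lp.mean().item()

score = probe_logprob("France proposes an alliance with England.",
                      "Sure, let's attack the moon together.",
                      "I don't understand.")
print(score)  # higher (less negative) => probe reply more likely => message more suspect
```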


When Life Gives You Lemons, Make Cherryade: Converting Feedback from Bad Responses into Good Labels

Oct 28, 2022
Weiyan Shi, Emily Dinan, Kurt Shuster, Jason Weston, Jing Xu

Deployed dialogue agents have the potential to integrate human feedback to continuously improve themselves. However, humans may not always provide explicit signals when the chatbot makes mistakes during interactions. In this work, we propose Juicer, a framework to make use of both binary and free-form textual human feedback. It works by: (i) extending sparse binary feedback by training a satisfaction classifier to label the unlabeled data; and (ii) training a reply corrector to map the bad replies to good ones. We find that augmenting training with model-corrected replies improves the final dialogue model, and we can further improve performance by using both positive and negative replies through the recently proposed Director model.
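
The data flow in (i) and (ii) can be sketched as a small augmentation loop. The satisfaction classifier and reply corrector below are hypothetical stand-ins (trivial lambdas), not Juicer's trained components; only the overall flow, keeping good replies and rewriting bad ones, mirrors the description above.

```python
# Sketch of the augmentation loop: extend sparse feedback with a satisfaction
# classifier, then replace low-satisfaction replies with corrected ones.
from typing import Callable, List, Tuple

Dialogue = Tuple[str, str]  # (context, bot reply)

def build_training_data(
    logs: List[Dialogue],
    satisfaction_clf: Callable[[str, str], float],  # estimated P(user satisfied)
    corrector: Callable[[str, str], str],           # maps a bad reply to a better one
    threshold: float = 0.5,
) -> List[Dialogue]:
    augmented = []
    for context, reply in logs:
        if satisfaction_clf(context, reply) >= threshold:
            augmented.append((context, reply))                       # keep good replies
        else:
            augmented.append((context, corrector(context, reply)))   # rewrite bad ones
    return augmented

# Toy usage with trivial stand-ins for the learned models.
logs = [("How are you?", "Potato."), ("Hello!", "Hi, nice to meet you!")]
clf = lambda ctx, rep: 0.9 if rep.endswith("!") else 0.1
fix = lambda ctx, rep: "I'm doing well, thanks for asking!"
print(build_training_data(logs, clf, fix))
```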


Anticipating Safety Issues in E2E Conversational AI: Framework and Tooling

Jul 23, 2021
Emily Dinan, Gavin Abercrombie, A. Stevie Bergman, Shannon Spruit, Dirk Hovy, Y-Lan Boureau, Verena Rieser

Over the last several years, end-to-end neural conversational agents have vastly improved in their ability to carry a chit-chat conversation with humans. However, these models are often trained on large datasets from the internet and, as a result, may learn undesirable behaviors from this data, such as toxic or otherwise harmful language. Researchers must thus wrestle with the issue of how and when to release these models. In this paper, we survey the safety problem landscape for end-to-end conversational AI and discuss recent and related work. We highlight tensions between values, potential positive impact, and potential harms, and we provide a framework for making decisions about whether and how to release these models, following the tenets of value-sensitive design. We additionally provide a suite of tools to enable researchers to make better-informed decisions about training and releasing end-to-end conversational AI models.


Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness

Dec 30, 2020
Sabrina J. Mielke, Arthur Szlam, Y-Lan Boureau, Emily Dinan

Open-domain dialogue agents have vastly improved, but still confidently hallucinate knowledge or express doubt when asked straightforward questions. In this work, we analyze whether state-of-the-art chit-chat models can express metacognitive capabilities through their responses: does a verbalized expression of doubt (or confidence) match the likelihood that the model's answer is incorrect (or correct)? We find that these models are poorly calibrated in this sense, yet we show that the representations within the models can be used to accurately predict the likelihood of correctness. By incorporating these correctness predictions into the training of a controllable generation model, we obtain a dialogue agent with greatly improved linguistic calibration.
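
A hedged sketch of the calibration step: fit a small probe on the model's internal representations to predict whether an answer will be correct, then bucket the predicted probability into a confidence control signal for generation. The arrays, thresholds, and control tokens below are placeholders, not the paper's data or model.

```python
# Sketch: probe hidden states for correctness, then map the predicted
# probability to a confidence control token used at generation time.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
hidden_states = rng.normal(size=(1000, 768))  # placeholder representations
was_correct = rng.integers(0, 2, size=1000)   # placeholder correctness labels

calibrator = LogisticRegression(max_iter=1000).fit(hidden_states, was_correct)

def confidence_token(h: np.ndarray) -> str:
    p = calibrator.predict_proba(h.reshape(1, -1))[0, 1]
    if p > 0.75:
        return "<HIGH_CONF>"   # e.g. "I'm certain that ..."
    if p > 0.4:
        return "<MED_CONF>"    # e.g. "I think ..."
    return "<LOW_CONF>"        # e.g. "I'm not sure, but ..."

print(confidence_token(hidden_states[0]))
```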


Recipes for Safety in Open-domain Chatbots

Oct 22, 2020
Jing Xu, Da Ju, Margaret Li, Y-Lan Boureau, Jason Weston, Emily Dinan

Models trained on large unlabeled corpora of human interactions will learn patterns and mimic behaviors therein, which include offensive or otherwise toxic behavior and unwanted biases. We investigate a variety of methods to mitigate these issues in the context of open-domain generative dialogue models. We introduce a new human-and-model-in-the-loop framework both for training safer models and for evaluating them, as well as a novel method to distill safety considerations into generative models without the use of an external classifier at deployment time. We conduct experiments comparing these methods and find that our new techniques are (i) safer than existing models, as measured by automatic and human evaluations, while (ii) maintaining usability metrics such as engagingness relative to the state of the art. We then discuss the limitations of this work by analyzing failure cases of our models.
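
The abstract contrasts distilling safety into the generative model with relying on an external classifier at deployment time. The sketch below shows that external, two-stage pattern in its simplest form; the keyword heuristic is a placeholder for a trained safety classifier and the canned response text is an assumption, not the paper's models.

```python
# Sketch of a two-stage safety layer: generate, classify, and fall back to a
# canned response when the context or the candidate reply is flagged.
from typing import Callable

CANNED_RESPONSE = "Hey, do you want to talk about something else?"

def placeholder_unsafe(text: str) -> bool:
    """Stand-in for a learned safety classifier."""
    blocklist = {"idiot", "stupid", "hate"}
    return any(word in text.lower() for word in blocklist)

def two_stage_respond(generate_reply: Callable[[str], str], context: str) -> str:
    reply = generate_reply(context)
    if placeholder_unsafe(context) or placeholder_unsafe(reply):
        return CANNED_RESPONSE
    return reply

# Toy usage with a stand-in generator.
print(two_stage_respond(lambda ctx: "You are an idiot.", "Hi there!"))
```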
