Title: The Lab vs The Crowd: An Investigation into Data Quality for Neural Dialogue Models

Dec 07, 2020

José Lopes, Francisco J. Chiyah Garcia, Helen Hastie

Abstract: Challenges around collecting and processing quality data have hampered progress in data-driven dialogue models. Previous approaches are moving away from costly, resource-intensive lab settings, where collection is slow but where the data is deemed of high quality. The advent of crowd-sourcing platforms, such as Amazon Mechanical Turk, has provided researchers with an alternative cost-effective and rapid way to collect data. However, the collection of fluid, natural spoken or textual interaction can be challenging, particularly between two crowd-sourced workers. In this study, we compare the performance of dialogue models trained on data for the same interaction task, collected in two different settings: in the lab vs. crowd-sourced. We find that fewer lab dialogues are needed to reach similar accuracy: less than half as much lab data as crowd-sourced data. We discuss the advantages and disadvantages of each data collection method.

* Accepted at Human in the Loop Dialogue Systems Workshop @NeurIPS 2020
