Title: TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation

Sep 27, 2021

Adaku Uchendu, Zeyu Ma, Thai Le, Rui Zhang, Dongwon Lee

Abstract: Recent progress in generative language models has enabled machines to generate astonishingly realistic texts. While there are many legitimate applications of such models, there is also a rising need to distinguish machine-generated texts from human-written ones (e.g., fake news detection). However, to the best of our knowledge, there is currently no benchmark environment with datasets and tasks to systematically study the so-called "Turing Test" problem for neural text generation methods. In this work, we present the TuringBench benchmark environment, which comprises (1) a dataset with 200K human- or machine-generated samples across 20 labels {Human, GPT-1, GPT-2_small, GPT-2_medium, GPT-2_large, GPT-2_xl, GPT-2_PyTorch, GPT-3, GROVER_base, GROVER_large, GROVER_mega, CTRL, XLM, XLNET_base, XLNET_large, FAIR_wmt19, FAIR_wmt20, TRANSFORMER_XL, PPLM_distil, PPLM_gpt2}, (2) two benchmark tasks, namely Turing Test (TT) and Authorship Attribution (AA), and (3) a website with leaderboards. Our preliminary experimental results using TuringBench show that FAIR_wmt20 and GPT-3 are the current winners among all language models tested: they generate the most human-like texts, which receive the lowest F1 scores from five state-of-the-art TT detection models. TuringBench is available at: https://turingbench.ist.psu.edu/

* Accepted to Findings of EMNLP 2021
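
The relationship between the two tasks named in the abstract can be sketched in a few lines of Python. This is only an illustration, not the benchmark's own code: the label names are copied from the abstract, while the helper names `to_tt_label` and `f1_score`, the lowercase `"human"`/`"machine"` strings, and the choice of `"machine"` as the positive class are all assumptions. The abstract does not say how the benchmark constructs its TT subsets or which F1 variant it reports.

```python
# The 20 labels listed in the abstract (used directly as AA classes).
LABELS = [
    "Human", "GPT-1", "GPT-2_small", "GPT-2_medium", "GPT-2_large",
    "GPT-2_xl", "GPT-2_PyTorch", "GPT-3", "GROVER_base", "GROVER_large",
    "GROVER_mega", "CTRL", "XLM", "XLNET_base", "XLNET_large",
    "FAIR_wmt19", "FAIR_wmt20", "TRANSFORMER_XL", "PPLM_distil", "PPLM_gpt2",
]


def to_tt_label(aa_label: str) -> str:
    """Collapse a 20-way AA label into a binary TT label (hypothetical helper)."""
    if aa_label not in LABELS:
        raise ValueError(f"unknown label: {aa_label!r}")
    return "human" if aa_label == "Human" else "machine"


def f1_score(y_true, y_pred, positive="machine") -> float:
    """Binary F1 for one positive class; returns 0.0 when there are no true positives."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy usage: a detector that misses one of two machine-written samples.
gold = [to_tt_label(x) for x in ["Human", "GPT-3", "FAIR_wmt20", "Human"]]
pred = ["human", "machine", "human", "human"]
score = f1_score(gold, pred)
```

Under this reading, a generator is harder to detect the lower the detectors' F1 on its samples, which matches how the abstract ranks FAIR_wmt20 and GPT-3.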
