Yada Pruksachatkun

jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models

Mar 04, 2020

Yada Pruksachatkun, Phil Yeres, Haokun Liu, Jason Phang, Phu Mon Htut, Alex Wang, Ian Tenney, Samuel R. Bowman

Abstract: We introduce jiant, an open source toolkit for conducting multitask and transfer learning experiments on English NLU tasks. jiant enables modular and configuration-driven experimentation with state-of-the-art models and implements a broad set of tasks for probing, transfer learning, and multitask training experiments. jiant implements over 50 NLU tasks, including all GLUE and SuperGLUE benchmark tasks. We demonstrate that jiant reproduces published performance on a variety of tasks and models, including BERT and RoBERTa. jiant is available at https://jiant.info.

SuperGLUE: A Stickier Benchmark for General-Purpose Language Understanding Systems

May 02, 2019

Alex Wang, Yada Pruksachatkun, Nikita Nangia, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman

Abstract: In the last year, new models and methods for pretraining and transfer learning have driven striking performance improvements across a range of language understanding tasks. The GLUE benchmark, introduced one year ago, offers a single-number metric that summarizes progress on a diverse set of such tasks, but performance on the benchmark has recently come close to the level of non-expert humans, suggesting limited headroom for further research. This paper recaps lessons learned from the GLUE benchmark and presents SuperGLUE, a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, improved resources, and a new public leaderboard. SuperGLUE will be available soon at super.gluebenchmark.com.

* super.gluebenchmark.com
