William Huang

Adversarially Constructed Evaluation Sets Are More Challenging, but May Not Be Fair

Nov 16, 2021
Jason Phang, Angelica Chen, William Huang, Samuel R. Bowman

Via arXiv

Types of Out-of-Distribution Texts and How to Detect Them

Sep 14, 2021
Udit Arora, William Huang, He He

Via arXiv

Comparing Test Sets with Item Response Theory

Jun 01, 2021
Clara Vania, Phu Mon Htut, William Huang, Dhara Mungra, Richard Yuanzhe Pang, Jason Phang, Haokun Liu, Kyunghyun Cho, Samuel R. Bowman

Via arXiv

Does Putting a Linguist in the Loop Improve NLU Data Collection?

Apr 15, 2021
Alicia Parrish, William Huang, Omar Agha, Soo-Hwan Lee, Nikita Nangia, Alex Warstadt, Karmanya Aggarwal, Emily Allaway, Tal Linzen, Samuel R. Bowman

Via arXiv

Counterfactually-Augmented SNLI Training Data Does Not Yield Better Generalization Than Unaugmented Data

Oct 09, 2020
William Huang, Haokun Liu, Samuel R. Bowman

Via arXiv

Precise Task Formalization Matters in Winograd Schema Evaluations

Oct 08, 2020
Haokun Liu, William Huang, Dhara A. Mungra, Samuel R. Bowman

Via arXiv