Anjana Arunkumar

LINGO : Visually Debiasing Natural Language Instructions to Support Task Diversity

Apr 12, 2023
Anjana Arunkumar, Shubham Sharma, Rakhi Agrawal, Sriram Chandrasekaran, Chris Bryan

Real-Time Visual Feedback to Guide Benchmark Creation: A Human-and-Metric-in-the-Loop Workflow

Feb 09, 2023
Anjana Arunkumar, Swaroop Mishra, Bhavdeep Sachdeva, Chitta Baral, Chris Bryan

Hardness of Samples Need to be Quantified for a Reliable Evaluation System: Exploring Potential Opportunities with a New Task

Oct 14, 2022
Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral

A Survey of Parameters Associated with the Quality of Benchmarks in NLP

Oct 14, 2022
Swaroop Mishra, Anjana Arunkumar, Chris Bryan, Chitta Baral

Investigating the Failure Modes of the AUC metric and Exploring Alternatives for Evaluating Systems in Safety Critical Applications

Oct 10, 2022
Swaroop Mishra, Anjana Arunkumar, Chitta Baral

Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks

Apr 16, 2022
Yizhong Wang, Swaroop Mishra, Pegah Alipoormolabashi, Yeganeh Kordi, Amirreza Mirzaei, Anjana Arunkumar, Arjun Ashok, Arut Selvan Dhanasekaran, Atharva Naik, David Stap, Eshaan Pathak, Giannis Karamanolakis, Haizhi Gary Lai, Ishan Purohit, Ishani Mondal, Jacob Anderson, Kirby Kuznia, Krima Doshi, Maitreya Patel, Kuntal Kumar Pal, Mehrad Moradshahi, Mihir Parmar, Mirali Purohit, Neeraj Varshney, Phani Rohitha Kaza, Pulkit Verma, Ravsehaj Singh Puri, Rushang Karia, Shailaja Keyur Sampat, Savan Doshi, Siddhartha Mishra, Sujan Reddy, Sumanta Patro, Tanay Dixit, Xudong Shen, Chitta Baral, Yejin Choi, Hannaneh Hajishirzi, Noah A. Smith, Daniel Khashabi

A Proposal to Study "Is High Quality Data All We Need?"

Mar 12, 2022
Swaroop Mishra, Anjana Arunkumar

Front Contribution instead of Back Propagation

Jun 10, 2021
Swaroop Mishra, Anjana Arunkumar

How Robust are Model Rankings: A Leaderboard Customization Approach for Equitable Evaluation

Jun 10, 2021
Swaroop Mishra, Anjana Arunkumar

DQI: A Guide to Benchmark Evaluation

Aug 10, 2020
Swaroop Mishra, Anjana Arunkumar, Bhavdeep Sachdeva, Chris Bryan, Chitta Baral
