Gabriel Stanovsky

Looking Beyond The Top-1: Transformers Determine Top Tokens In Order

Oct 26, 2024
Via arXiv

Surveying the Landscape of Image Captioning Evaluation: A Comprehensive Taxonomy and Novel Ensemble Method

Aug 09, 2024
Via arXiv

SEAM: A Stochastic Benchmark for Multi-Document Tasks

Jun 23, 2024
Via arXiv

In-Context Learning on a Budget: A Case Study in Named Entity Recognition

Jun 19, 2024
Via arXiv

Applying Intrinsic Debiasing on Downstream Tasks: Challenges and Considerations for Machine Translation

Jun 02, 2024
Via arXiv

A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns

May 23, 2024
Via arXiv

Do Zombies Understand? A Choose-Your-Own-Adventure Exploration of Machine Cognition

Mar 01, 2024
Via arXiv

Leveraging Collection-Wide Similarities for Unsupervised Document Structure Extraction

Feb 21, 2024
Via arXiv

K-QA: A Real-World Medical Q&A Benchmark

Jan 25, 2024
Via arXiv

State of What Art? A Call for Multi-Prompt LLM Evaluation

Dec 31, 2023
Via arXiv