Grace Kim

GDPval: Evaluating AI Model Performance on Real-World Economically Valuable Tasks

Oct 05, 2025
Via arXiv

ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

May 19, 2025
Via arXiv

Is the Top Still Spinning? Evaluating Subjectivity in Narrative Understanding

Apr 01, 2025
Via arXiv

Space for Improvement: Navigating the Design Space for Federated Learning in Satellite Constellations

Oct 31, 2024
Via arXiv

Complex Claim Verification with Evidence Retrieved in the Wild

May 19, 2023
Via arXiv