Graham Neubig

Carnegie Mellon University

Grounding Multilingual Multimodal LLMs With Cultural Knowledge

Aug 12, 2025
Via arXiv

SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model

Jul 31, 2025
Via arXiv

Checklists Are Better Than Reward Models For Aligning Language Models

Jul 24, 2025
Via arXiv

OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety

Jul 08, 2025
Via arXiv

ZINA: Multimodal Fine-grained Hallucination Detection and Editing

Jun 16, 2025
Via arXiv

CAIRe: Cultural Attribution of Images by Retrieval-Augmented Evaluation

Jun 10, 2025
Via arXiv

FieldWorkArena: Agentic AI Benchmark for Real Field Work Tasks

May 26, 2025
Via arXiv

The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think

May 15, 2025
Via arXiv

VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge

Apr 15, 2025
Via arXiv

Do LLMs Understand Your Translations? Evaluating Paragraph-level MT with Question Answering

Apr 10, 2025
Via arXiv