Yuheng Wu

DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic

May 22, 2025
Via arXiv

SATBench: Benchmarking LLMs' Logical Reasoning via Automated Puzzle Generation from SAT Formulas

May 20, 2025
Via arXiv

LangCoop: Collaborative Driving with Language

Apr 21, 2025
Via arXiv

Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models

Apr 05, 2025
Via arXiv

CodeARC: Benchmarking Reasoning Capabilities of LLM Agents for Inductive Program Synthesis

Mar 29, 2025
Via arXiv

Competent but Rigid: Identifying the Gap in Empowering AI to Participate Equally in Group Decision-Making

Feb 17, 2023
Via arXiv