Jeff Z. Pan

Mind-Studio: Executable World Models with Lookahead Evaluation for Partially Observable Games

Jun 16, 2026
Via arXiv

Bridging the Agent-World Gap: Text World Models for LLM-based Agents

Jun 08, 2026
Via arXiv

Multi-Turn Evaluation of Deep Research Agents Under Process-Level Feedback

Jun 08, 2026
Via arXiv

Terminal-World: Scaling Terminal-Agent Environments via Agent Skills

May 20, 2026
Via arXiv

InnoEval: On Research Idea Evaluation as a Knowledge-Grounded, Multi-Perspective Reasoning Problem

Feb 16, 2026
Via arXiv

Chain Of Thought Compression: A Theoretical Analysis

Jan 29, 2026
Via arXiv

LogicScore: Fine-grained Logic Evaluation of Conciseness, Completeness, and Determinateness in Attributed Question Answering

Jan 22, 2026
Via arXiv

LogicScore: Fine-grained Logic Evaluation of Conciseness, Completeness, and Determinateness in Attributed Question Answering

Jan 21, 2026
Via arXiv

Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Jan 09, 2026
Via arXiv

Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents

Dec 23, 2025
Via arXiv