Kehan Jiang

Agent-ValueBench: A Comprehensive Benchmark for Evaluating Agent Values

May 11, 2026

NeuReasoner: Towards Explainable, Controllable, and Unified Reasoning via Mixture-of-Neurons

Apr 03, 2026

FoE: Forest of Errors Makes the First Solution the Best in Large Reasoning Models

Apr 03, 2026

How Order-Sensitive Are LLMs? OrderProbe for Deterministic Structural Reconstruction

Jan 13, 2026

QuantEval: A Benchmark for Financial Quantitative Tasks in Large Language Models

Jan 13, 2026

Meta-R1: Empowering Large Reasoning Models with Metacognition

Aug 24, 2025