Tianjun Pan

OPSDL: On-Policy Self-Distillation for Long-Context Language Models

Apr 19, 2026
Via arXiv

SEA-Eval: A Benchmark for Evaluating Self-Evolving Agents Beyond Episodic Assessment

Apr 14, 2026
Via arXiv

RubricEval: A Rubric-Level Meta-Evaluation Benchmark for LLM Judges in Instruction Following

Mar 26, 2026
Via arXiv