Chenxi Whitehouse

Autodata: An agentic data scientist to create high quality synthetic data

Jun 25, 2026
Via arXiv

Summarization is Not Dead Yet

Jun 06, 2026
Via arXiv

Reasoning over mathematical objects: on-policy reward modeling and test time aggregation

Mar 19, 2026
Via arXiv

Text-to-Stage: Spatial Layouts from Long-form Narratives

Mar 18, 2026
Via arXiv

APRES: An Agentic Paper Revision and Evaluation System

Mar 03, 2026
Via arXiv

When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation

Feb 18, 2026
Via arXiv

Macaron: Controlled, Human-Written Benchmark for Multilingual and Multicultural Reasoning via Template-Filling

Feb 11, 2026
Via arXiv

Rethinking Rubric Generation for Improving LLM Judge and Reward Modeling for Open-ended Tasks

Feb 04, 2026
Via arXiv

The Llama 4 Herd: Architecture, Training, Evaluation, and Deployment Notes

Jan 15, 2026
Via arXiv

Training AI Co-Scientists Using Rubric Rewards

Dec 29, 2025
Via arXiv