Yuming

Rapheal

Let the Results Speak: A Replication-First Paradigm for LLM Behavioral Benchmarking

May 27, 2026
Via arXiv