Niccolò Gentile

Shaping Explanations: Semantic Reward Modeling with Encoder-Only Transformers for GRPO

Sep 16, 2025
Via arXiv