Ruohui Huang

Rubrics to Tokens: Bridging Response-level Rubrics and Token-level Rewards in Instruction Following Tasks

Apr 03, 2026
Via arXiv

SERL: Self-Examining Reinforcement Learning on Open-Domain

Nov 18, 2025
Via arXiv

Conifer: Improving Complex Constrained Instruction-Following Ability of Large Language Models

Apr 03, 2024
Via arXiv