Aviral Kumar

RLAD: Training LLMs to Discover Abstractions for Solving Reasoning Problems

Oct 02, 2025
Via arXiv

RaC: Robot Learning for Long-Horizon Tasks by Scaling Recovery and Correction

Sep 09, 2025
Via arXiv

Compute-Optimal Scaling for Value-Based Deep RL

Aug 20, 2025
Via arXiv

Reasoning as an Adaptive Defense for Safety

Jul 01, 2025
Via arXiv

e3: Learning to Explore Enables Extrapolation of Test-Time Compute for LLMs

Jun 10, 2025
Via arXiv

Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction

Jun 09, 2025
Via arXiv

Horizon Reduction Makes RL Scalable

Jun 08, 2025
Via arXiv

Grounded Reinforcement Learning for Visual Reasoning

May 29, 2025
Via arXiv

Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

May 29, 2025
Via arXiv

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Mar 10, 2025
Via arXiv