Aviral Kumar

RaC: Robot Learning for Long-Horizon Tasks by Scaling Recovery and Correction

Sep 09, 2025
Via arXiv

Compute-Optimal Scaling for Value-Based Deep RL

Aug 20, 2025
Via arXiv

Reasoning as an Adaptive Defense for Safety

Jul 01, 2025
Via arXiv

e3: Learning to Explore Enables Extrapolation of Test-Time Compute for LLMs

Jun 10, 2025
Via arXiv

Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction

Jun 09, 2025
Via arXiv

Horizon Reduction Makes RL Scalable

Jun 08, 2025
Via arXiv

Bigger, Regularized, Categorical: High-Capacity Value Functions are Efficient Multi-Task Learners

May 29, 2025
Via arXiv

Grounded Reinforcement Learning for Visual Reasoning

May 29, 2025
Via arXiv

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Mar 10, 2025
Via arXiv

Scaling Test-Time Compute Without Verification or RL is Suboptimal

Feb 18, 2025
Via arXiv