Rishabh Agarwal

Google Research Brain Team

On scalable oversight with weak LLMs judging strong LLMs

Jul 05, 2024

SiT: Symmetry-Invariant Transformers for Generalisation in Reinforcement Learning

Jun 21, 2024

Many-Shot In-Context Learning

Apr 17, 2024

Stop Regressing: Training Value Functions via Classification for Scalable Deep RL

Mar 06, 2024

Transformers Can Achieve Length Generalization But Not Robustly

Feb 14, 2024

V-STaR: Training Verifiers for Self-Taught Reasoners

Feb 09, 2024

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Dec 22, 2023

Learning and Controlling Silicon Dopant Transitions in Graphene using Scanning Transmission Electron Microscopy

Nov 21, 2023

Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research

Oct 12, 2023

DistillSpec: Improving Speculative Decoding via Knowledge Distillation

Oct 12, 2023