Rishabh Agarwal

Google Research, Brain Team

Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling

Oct 15, 2024

Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning

Oct 10, 2024

Not All LLM Reasoners Are Created Equal

Oct 02, 2024

Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling

Aug 29, 2024

Generative Verifiers: Reward Modeling as Next-Token Prediction

Aug 27, 2024

Gemma 2: Improving Open Language Models at a Practical Size

Aug 02, 2024

Don't Throw Away Data: Better Sequence Knowledge Distillation

Jul 15, 2024

On scalable oversight with weak LLMs judging strong LLMs

Jul 05, 2024

SiT: Symmetry-Invariant Transformers for Generalisation in Reinforcement Learning

Jun 21, 2024

Many-Shot In-Context Learning

Apr 17, 2024