Shengyi Huang

The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization

Mar 24, 2024
Shengyi Huang, Michael Noukhovitch, Arian Hosseini, Kashif Rasul, Weixun Wang, Lewis Tunstall

Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning

Feb 05, 2024
Shengyi Huang, Quentin Gallouédec, Florian Felten, Antonin Raffin, Rousslan Fernand Julien Dossa, Yanxiao Zhao, Ryan Sullivan, Viktor Makoviychuk, Denys Makoviichuk, Mohamad H. Danesh, Cyril Roumégous, Jiayi Weng, Chufan Chen, Md Masudur Rahman, João G. M. Araújo, Guorui Quan, Daniel Tan, Timo Klein, Rujikorn Charakorn, Mark Towers, Yann Berthelot, Kinal Mehta, Dipam Chakraborty, Arjun KG, Valentin Charraut, Chang Ye, Zichen Liu, Lucas N. Alegre, Alexander Nikulin, Xiao Hu, Tianlin Liu, Jongwook Choi, Brent Yi

Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks

Oct 26, 2023
Ryan Sullivan, Akarsh Kumar, Shengyi Huang, John P. Dickerson, Joseph Suarez

Zephyr: Direct Distillation of LM Alignment

Oct 25, 2023
Lewis Tunstall, Edward Beeching, Nathan Lambert, Nazneen Rajani, Kashif Rasul, Younes Belkada, Shengyi Huang, Leandro von Werra, Clémentine Fourrier, Nathan Habib, Nathan Sarrazin, Omar Sanseviero, Alexander M. Rush, Thomas Wolf

Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform

Sep 29, 2023
Shengyi Huang, Jiayi Weng, Rujikorn Charakorn, Min Lin, Zhongwen Xu, Santiago Ontañón

EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine

Jun 21, 2022
Jiayi Weng, Min Lin, Shengyi Huang, Bo Liu, Denys Makoviichuk, Viktor Makoviychuk, Zichen Liu, Yufan Song, Ting Luo, Yukun Jiang, Zhongwen Xu, Shuicheng Yan

A2C is a special case of PPO

May 18, 2022
Shengyi Huang, Anssi Kanervisto, Antonin Raffin, Weixun Wang, Santiago Ontañón, Rousslan Fernand Julien Dossa

CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms

Nov 16, 2021
Shengyi Huang, Rousslan Fernand Julien Dossa, Chang Ye, Jeff Braga

Gym-μRTS: Toward Affordable Full Game Real-time Strategy Games Research with Deep Reinforcement Learning

May 21, 2021
Shengyi Huang, Santiago Ontañón, Chris Bamford, Lukasz Grela

Griddly: A platform for AI research in games

Nov 21, 2020
Chris Bamford, Shengyi Huang, Simon Lucas
