Wei Shen

EndoGSLAM: Real-Time Dense Reconstruction and Tracking in Endoscopic Surgeries using Gaussian Splatting

Mar 22, 2024
Kailing Wang, Chen Yang, Yuehao Wang, Sikuang Li, Yan Wang, Qi Dou, Xiaokang Yang, Wei Shen

Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards

Mar 14, 2024
Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu

Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation

Mar 08, 2024
Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu

GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting

Feb 20, 2024
Chen Yang, Sikuang Li, Jiemin Fang, Ruofan Liang, Lingxi Xie, Xiaopeng Zhang, Wei Shen, Qi Tian

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

Feb 08, 2024
Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Feb 05, 2024
Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui

ViTree: Single-path Neural Tree for Step-wise Interpretable Fine-grained Visual Categorization

Jan 30, 2024
Danning Lao, Qi Liu, Jiazi Bu, Junchi Yan, Wei Shen

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback

Jan 21, 2024
Songyang Gao, Qiming Ge, Wei Shen, Shihan Dou, Junjie Ye, Xiao Wang, Rui Zheng, Yicheng Zou, Zhi Chen, Hang Yan, Qi Zhang, Dahua Lin

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Jan 12, 2024
Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
