Jiayi Zhou

Sequence to Sequence Reward Modeling: Improving RLHF by Language Feedback

Aug 30, 2024

Improving GNSS Positioning in Challenging Urban Areas by Digital Twin Database Correction

Aug 25, 2024

Language Models Resist Alignment

Jun 10, 2024

Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective

Feb 20, 2024

AI Alignment: A Comprehensive Survey

Nov 01, 2023

Safety-Gymnasium: A Unified Safe Reinforcement Learning Benchmark

Oct 19, 2023

OmniSafe: An Infrastructure for Accelerating Safe Reinforcement Learning Research

May 16, 2023