
Enyu Zhou

JFTA-Bench: Evaluate LLM's Ability of Tracking and Analyzing Malfunctions Using Fault Trees

Mar 24, 2026

Steering LLMs via Scalable Interactive Oversight

Feb 04, 2026

FRoM-W1: Towards General Humanoid Whole-Body Control with Language Instructions

Jan 19, 2026

RMB: Comprehensively Benchmarking Reward Models in LLM Alignment

Oct 13, 2024

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

Jul 08, 2024

SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance

Jun 26, 2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle

Jun 17, 2024

MetaRM: Shifted Distributions Alignment via Meta-Learning

May 01, 2024

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Feb 05, 2024

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Jan 12, 2024