Shihan Dou

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study

Jul 08, 2024

SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance

Jun 26, 2024

Aligning Large Language Models from Self-Reference AI Feedback with one General Principle

Jun 17, 2024

MetaRM: Shifted Distributions Alignment via Meta-Learning

May 01, 2024

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

Mar 18, 2024

Advancing Translation Preference Modeling with RLHF: A Step Towards Cost-Effective Solution

Feb 27, 2024

CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models

Feb 26, 2024

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

Feb 08, 2024

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Feb 05, 2024

MouSi: Poly-Visual-Expert Vision-Language Models

Jan 30, 2024