
JinJin Li

RS-DPO: A Hybrid Rejection Sampling and Direct Preference Optimization Method for Alignment of Large Language Models

Feb 15, 2024
Saeed Khaki, JinJin Li, Lan Ma, Liu Yang, Prathap Ramachandra

Via arXiv