Title: Personalizing LLMs with Binary Feedback: A Preference-Corrected Optimization Framework

May 11, 2026

Xilai Ma, Liye Zhao, Weijun Yao, Haibing Di, Wenya Wang, Jing Li

Abstract: Large Language Model (LLM) personalization aims to align model behavior with individual user preferences. Existing methods often focus on each user's history in isolation and neglect the essential role of inter-user differences. We propose C-BPO, a framework that personalizes LLMs via preference-calibrated binary signals. By treating the target user's data as positive feedback and other users' data as an auxiliary set of implicit negative signals, C-BPO captures inter-user differences. To mitigate the preference overlap issue, where shared task knowledge is erroneously penalized, we derive an objective grounded in Positive-Unlabeled (PU) learning theory. This approach purifies the negative signals by subtracting the "positive bias", ensuring alignment with a user's unique idiosyncrasies without compromising general helpfulness. Experiments across various personalization tasks and backbone LLMs show that C-BPO consistently outperforms baselines, demonstrating the efficacy of preference-calibrated binary signals in modeling inter-user differences.
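The abstract does not state the C-BPO objective itself, so the following is only a minimal sketch of the general idea it builds on: the standard non-negative PU risk estimator from the PU learning literature, not the paper's loss. It illustrates how a "positive bias" term can be subtracted from the risk computed on data that is treated as negative. The function names, the sigmoid loss, and the `prior` parameter (the assumed fraction of the auxiliary set that actually matches the target user's preferences) are all hypothetical choices made for illustration.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid loss l(z, y) = 1 / (1 + exp(y * z)) for a label y in {+1, -1}."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk (generic estimator, NOT the paper's objective).

    pos_scores: model scores on the target user's data (positives).
    unl_scores: model scores on other users' data (treated as unlabeled).
    prior:      assumed share of positives hidden in the unlabeled set.
    """
    # Risk of scoring true positives as positive.
    pos_risk = mean([sigmoid_loss(s, +1) for s in pos_scores])
    # Naive risk of treating every unlabeled example as negative.
    unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # "Positive bias": the part of that risk caused by hidden positives,
    # estimated from the labeled positives themselves.
    pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    # Subtract the bias and clamp at zero so the negative risk stays valid.
    corrected_neg_risk = max(0.0, unl_as_neg - prior * pos_as_neg)
    return prior * pos_risk + corrected_neg_risk


if __name__ == "__main__":
    target_user = [2.0, 3.0]          # hypothetical scores
    other_users = [-1.0, 0.5, 2.5]    # hypothetical scores
    print(nn_pu_risk(target_user, other_users, prior=0.3))
```

When the auxiliary set looks exactly like the positives and `prior` is 1, the corrected negative term vanishes, which mirrors the abstract's point that shared knowledge should not be penalized. How C-BPO estimates or sets the overlap is not described in the abstract and is left open here.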

* Accepted by ACL 2026 Main
