Abstract: Generative Reward Models (GRMs) have attracted considerable research interest in reward modeling due to their interpretability, inference-time scalability, and potential for refinement through reinforcement learning (RL). However, widely used pairwise GRMs create a computational bottleneck when integrated with RL algorithms such as Group Relative Policy Optimization (GRPO). This bottleneck arises from two factors: (i) the O(n^2) time complexity of pairwise comparisons required to obtain relative scores, and (ii) the computational overhead of repeated sampling or additional chain-of-thought (CoT) reasoning to improve performance. To address the first factor, we propose Intergroup Relative Preference Optimization (IRPO), a novel RL framework that incorporates the well-established Bradley-Terry model into GRPO. By generating a pointwise score for each response, IRPO enables efficient evaluation of arbitrarily many candidates during RL training while preserving interpretability and fine-grained reward signals. Experimental results demonstrate that IRPO achieves state-of-the-art (SOTA) performance among pointwise GRMs across multiple benchmarks, with performance comparable to that of current leading pairwise GRMs. Furthermore, we show that IRPO significantly outperforms pairwise GRMs in post-training evaluations.
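The cost argument above can be illustrated with a minimal sketch. It is not the IRPO training objective, which the abstract does not specify; it only shows, under stated assumptions, why pointwise scores avoid the quadratic cost: n scores come from n scoring calls, and any pairwise preference then follows from the Bradley-Terry model without further model calls. The function names and the example scores are hypothetical, and the mean/std normalization is the standard GRPO-style group-relative advantage, used here purely for illustration.

```python
import math
import statistics


def bt_prob(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that response a is preferred over response b,
    given scalar (pointwise) scores. Equivalent to sigmoid(score_a - score_b)."""
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def pairwise_comparison_count(n: int) -> int:
    """Number of unordered pairs a pairwise judge must compare: n(n-1)/2, i.e. O(n^2)."""
    return n * (n - 1) // 2


def group_relative_advantages(scores: list[float]) -> list[float]:
    """GRPO-style advantages: each score normalized by the group mean and std.
    Assumption: population std; returns zeros if all scores are equal."""
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)
    if std == 0.0:
        return [0.0 for _ in scores]
    return [(s - mean) / std for s in scores]


# Hypothetical pointwise scores for a group of 4 sampled responses.
scores = [2.0, 0.5, 1.0, -1.0]

# Pointwise: 4 scoring calls. Pairwise: 6 comparisons for the same group.
n_pointwise_calls = len(scores)
n_pairwise_calls = pairwise_comparison_count(len(scores))

# Any preference probability is derived from the scores after the fact.
p_first_beats_second = bt_prob(scores[0], scores[1])
advantages = group_relative_advantages(scores)
```

For a group of 8 responses, the pairwise route needs 28 comparisons while the pointwise route needs 8 scoring calls, and the gap widens quadratically as the group grows.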




Abstract: The use of Deep Neural Networks (DNNs) in commercial applications is expanding rapidly. At the same time, the increasing complexity and cost of training DNN models have made protecting the intellectual property of trained models more urgent. In this regard, DNN watermarking has emerged as a crucial safeguarding technique. This paper presents FedReverse, a novel multiparty reversible watermarking approach for robust copyright protection with minimal performance impact. Unlike existing methods, FedReverse enables multiple parties to embed watermarks collaboratively after model training, so that each party can assert its own copyright claim. In addition, FedReverse is reversible: the watermark can be removed completely when all clients consent. FedReverse achieves perfect covering, meaning that observations of watermarked content do not reveal any information about the hidden watermark. It also resists Known Original Attacks (KOA), making it highly challenging for attackers to forge watermarks or infer the key. This paper further evaluates FedReverse through comprehensive simulations involving Multi-layer Perceptron (MLP) and Convolutional Neural Network (CNN) models trained on the MNIST dataset. The simulations demonstrate FedReverse's robustness, reversibility, and minimal impact on model accuracy across varying embedding parameters and multiple client scenarios.