Zheng Wen

Multi-Objective Preference Optimization: Improving Human Alignment of Generative Models

May 16, 2025
Via arXiv

Best Policy Learning from Trajectory Preference Feedback

Jan 31, 2025
Via arXiv

Evaluating Large Language Models on Financial Report Summarization: An Empirical Study

Nov 11, 2024
Via arXiv

Online Bandit Learning with Offline Preference Data

Jun 13, 2024
Via arXiv

RLHF and IIA: Perverse Incentives

Dec 02, 2023
Via arXiv

Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach

Oct 17, 2023
Via arXiv

Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale

Mar 20, 2023
Via arXiv

Approximate Thompson Sampling via Epistemic Neural Networks

Feb 18, 2023
Via arXiv

Leveraging Demonstrations to Improve Online Learning: Quality Matters

Feb 08, 2023
Via arXiv

Robustness of Epinets against Distributional Shifts

Jul 01, 2022
Via arXiv