Hongning Wang

Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation

Mar 08, 2024
Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu

Federated Linear Contextual Bandits with Heterogeneous Clients

Feb 29, 2024
Ethan Blaser, Chuanhao Li, Hongning Wang

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

Feb 26, 2024
Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang

Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits

Feb 21, 2024
Zhiwei Wang, Huazheng Wang, Hongning Wang

Incentivized Truthful Communication for Federated Bandits

Feb 07, 2024
Zhepei Wei, Chuanhao Li, Tianze Ren, Haifeng Xu, Hongning Wang

Towards Efficient and Exact Optimization of Language Model Alignment

Feb 02, 2024
Haozhe Ji, Cheng Lu, Yilin Niu, Pei Ke, Hongning Wang, Jun Zhu, Jie Tang, Minlie Huang

AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback

Feb 02, 2024
Jian Guan, Wei Wu, Zujie Wen, Peng Xu, Hongning Wang, Minlie Huang

The Impact of Snippet Reliability on Misinformation in Online Health Search

Jan 28, 2024
Anat Hashavit, Tamar Stern, Hongning Wang, Sarit Kraus

AlignBench: Benchmarking Chinese Alignment of Large Language Models

Dec 05, 2023
Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang
