
Hongning Wang


ChatGLM-RLHF: Practices of Aligning Large Language Models with Human Feedback

Apr 03, 2024
Zhenyu Hou, Yilin Niu, Zhengxiao Du, Xiaohan Zhang, Xiao Liu, Aohan Zeng, Qinkai Zheng, Minlie Huang, Hongning Wang, Jie Tang, Yuxiao Dong

Via arXiv

Overcoming Reward Overoptimization via Adversarial Policy Optimization with Lightweight Uncertainty Estimation

Mar 08, 2024
Xiaoying Zhang, Jean-Francois Ton, Wei Shen, Hongning Wang, Yang Liu

Via arXiv

Federated Linear Contextual Bandits with Heterogeneous Clients

Feb 29, 2024
Ethan Blaser, Chuanhao Li, Hongning Wang

Via arXiv

ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors

Feb 26, 2024
Zhexin Zhang, Yida Lu, Jingyuan Ma, Di Zhang, Rui Li, Pei Ke, Hao Sun, Lei Sha, Zhifang Sui, Hongning Wang, Minlie Huang

Via arXiv

Stealthy Adversarial Attacks on Stochastic Multi-Armed Bandits

Feb 21, 2024
Zhiwei Wang, Huazheng Wang, Hongning Wang

Via arXiv

Incentivized Truthful Communication for Federated Bandits

Feb 07, 2024
Zhepei Wei, Chuanhao Li, Tianze Ren, Haifeng Xu, Hongning Wang

Via arXiv

Towards Efficient and Exact Optimization of Language Model Alignment

Feb 02, 2024
Haozhe Ji, Cheng Lu, Yilin Niu, Pei Ke, Hongning Wang, Jun Zhu, Jie Tang, Minlie Huang

Via arXiv

AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback

Feb 02, 2024
Jian Guan, Wei Wu, Zujie Wen, Peng Xu, Hongning Wang, Minlie Huang

Via arXiv

The Impact of Snippet Reliability on Misinformation in Online Health Search

Jan 28, 2024
Anat Hashavit, Tamar Stern, Hongning Wang, Sarit Kraus

Via arXiv