Zhiheng Xi

Self-Demos: Eliciting Out-of-Demonstration Generalizability in Large Language Models

Apr 01, 2024
Wei He, Shichun Liu, Jun Zhao, Yiwen Ding, Yi Lu, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang

Subspace Defense: Discarding Adversarial Perturbations by Learning a Subspace for Clean Signals

Mar 24, 2024
Rui Zheng, Yuhao Zhou, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models

Mar 18, 2024
Weikang Zhou, Xiao Wang, Limao Xiong, Han Xia, Yingshuang Gu, Mingxu Chai, Fukang Zhu, Caishuang Huang, Shihan Dou, Zhiheng Xi, Rui Zheng, Songyang Gao, Yicheng Zou, Hang Yan, Yifan Le, Ruohui Wang, Lijun Li, Jing Shao, Tao Gui, Qi Zhang, Xuanjing Huang

RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions

Feb 26, 2024
Yuansen Zhang, Xiao Wang, Zhiheng Xi, Han Xia, Tao Gui, Qi Zhang, Xuanjing Huang

Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning

Feb 08, 2024
Zhiheng Xi, Wenxiang Chen, Boyang Hong, Senjie Jin, Rui Zheng, Wei He, Yiwen Ding, Shichun Liu, Xin Guo, Junzhe Wang, Honglin Guo, Wei Shen, Xiaoran Fan, Yuhao Zhou, Shihan Dou, Xiao Wang, Xinbo Zhang, Peng Sun, Tao Gui, Qi Zhang, Xuanjing Huang

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback

Feb 05, 2024
Shihan Dou, Yan Liu, Haoxiang Jia, Limao Xiong, Enyu Zhou, Wei Shen, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Xuanjing Huang, Tao Gui

MouSi: Poly-Visual-Expert Vision-Language Models

Jan 30, 2024
Xiaoran Fan, Tao Ji, Changhao Jiang, Shuo Li, Senjie Jin, Sirui Song, Junke Wang, Boyang Hong, Lu Chen, Guodong Zheng, Ming Zhang, Caishuang Huang, Rui Zheng, Zhiheng Xi, Yuhao Zhou, Shihan Dou, Junjie Ye, Hang Yan, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

Secrets of RLHF in Large Language Models Part II: Reward Modeling

Jan 12, 2024
Binghai Wang, Rui Zheng, Lu Chen, Yan Liu, Shihan Dou, Caishuang Huang, Wei Shen, Senjie Jin, Enyu Zhou, Chenyu Shi, Songyang Gao, Nuo Xu, Yuhao Zhou, Xiaoran Fan, Zhiheng Xi, Jun Zhao, Xiao Wang, Tao Ji, Hang Yan, Lixing Shen, Zhan Chen, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang

LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment

Dec 18, 2023
Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Jun Zhao, Wei Shen, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang

Improving Generalization of Alignment with Human Preferences through Group Invariant Learning

Oct 19, 2023
Rui Zheng, Wei Shen, Yuan Hua, Wenbin Lai, Shihan Dou, Yuhao Zhou, Zhiheng Xi, Xiao Wang, Haoran Huang, Tao Gui, Qi Zhang, Xuanjing Huang