Xianjun Yang

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Apr 18, 2024
Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bomassani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

Unveiling the Misuse Potential of Base Large Language Models via In-Context Learning

Apr 16, 2024
Xiao Wang, Tianze Chen, Xianjun Yang, Qi Zhang, Xun Zhao, Dahua Lin

A Safe Harbor for AI Evaluation and Red Teaming

Mar 07, 2024
Shayne Longpre, Sayash Kapoor, Kevin Klyman, Ashwin Ramaswami, Rishi Bommasani, Borhane Blili-Hamelin, Yangsibo Huang, Aviya Skowron, Zheng-Xin Yong, Suhas Kotha, Yi Zeng, Weiyan Shi, Xianjun Yang, Reid Southen, Alexander Robey, Patrick Chao, Diyi Yang, Ruoxi Jia, Daniel Kang, Sandy Pentland, Arvind Narayanan, Percy Liang, Peter Henderson

A Self-supervised Pressure Map Human Keypoint Detection Approach: Optimizing Generalization and Computational Efficiency Across Datasets

Feb 22, 2024
Chengzhang Yu, Xianjun Yang, Wenxia Bao, Shaonan Wang, Zhiming Yao

Test-Time Backdoor Attacks on Multimodal Large Language Models

Feb 13, 2024
Dong Lu, Tianyu Pang, Chao Du, Qian Liu, Xianjun Yang, Min Lin

Weak-to-Strong Jailbreaking on Large Language Models

Feb 05, 2024
Xuandong Zhao, Xianjun Yang, Tianyu Pang, Chao Du, Lei Li, Yu-Xiang Wang, William Yang Wang

TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution

Feb 02, 2024
Wenyue Hua, Xianjun Yang, Zelong Li, Cheng Wei, Yongfeng Zhang

Navigating the OverKill in Large Language Models

Jan 31, 2024
Chenyu Shi, Xiao Wang, Qiming Ge, Songyang Gao, Xianjun Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Xun Zhao, Dahua Lin

PLLaMa: An Open-source Large Language Model for Plant Science

Jan 03, 2024
Xianjun Yang, Junfeng Gao, Wenxin Xue, Erik Alexandersson
