Steffi Chern

Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate

Jan 30, 2024
Steffi Chern, Ethan Chern, Graham Neubig, Pengfei Liu

Combating Adversarial Attacks with Multi-Agent Debate

Jan 11, 2024
Steffi Chern, Zhen Fan, Andy Liu

Align on the Fly: Adapting Chatbot Behavior to Established Norms

Dec 26, 2023
Chunpu Xu, Steffi Chern, Ethan Chern, Ge Zhang, Zekun Wang, Ruibo Liu, Jing Li, Jie Fu, Pengfei Liu

FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios

Jul 26, 2023
I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu
