
Wenpeng Yin

Evaluating LLMs at Detecting Errors in LLM Responses

Apr 04, 2024
Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala, Salika Dave, Shaobo Qin, Arman Cohan, Wenpeng Yin, Rui Zhang

X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification

Mar 06, 2024
Hanzi Xu, Muhao Chen, Lifu Huang, Slobodan Vucetic, Wenpeng Yin

FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability

Feb 28, 2024
Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, Ran Xu, Wenpeng Yin, Caiming Xiong

Multimodal Instruction Tuning with Conditional Mixture of LoRA

Feb 24, 2024
Ying Shen, Zhiyang Xu, Qifan Wang, Yu Cheng, Wenpeng Yin, Lifu Huang

Contrastive Instruction Tuning

Feb 17, 2024
Tianyi Yan, Fei Wang, James Y. Huang, Wenxuan Zhou, Fan Yin, Aram Galstyan, Wenpeng Yin, Muhao Chen

Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models

Feb 16, 2024
Zihao Lin, Mohammad Beigi, Hongxuan Li, Yufan Zhou, Yuxiang Zhang, Qifan Wang, Wenpeng Yin, Lifu Huang

Large Language Models for Mathematical Reasoning: Progresses and Challenges

Jan 31, 2024
Janice Ahn, Rishu Verma, Renze Lou, Di Liu, Rui Zhang, Wenpeng Yin

MT-Ranker: Reference-free machine translation evaluation by inter-system ranking

Jan 30, 2024
Ibraheem Muhammad Moosa, Rui Zhang, Wenpeng Yin

GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models

Dec 11, 2023
Jiaxu Zhao, Meng Fang, Shirui Pan, Wenpeng Yin, Mykola Pechenizkiy

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

Dec 05, 2023
Renze Lou, Kai Zhang, Jian Xie, Yuxuan Sun, Janice Ahn, Hanzi Xu, Yu Su, Wenpeng Yin
