Wenpeng Yin

Evaluating LLMs at Detecting Errors in LLM Responses

Apr 04, 2024
Via arXiv

X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification

Mar 06, 2024
Via arXiv

FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability

Feb 28, 2024
Via arXiv

Multimodal Instruction Tuning with Conditional Mixture of LoRA

Feb 24, 2024
Via arXiv

Contrastive Instruction Tuning

Feb 17, 2024
Via arXiv

Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models

Feb 16, 2024
Via arXiv

Large Language Models for Mathematical Reasoning: Progresses and Challenges

Jan 31, 2024
Via arXiv

MT-Ranker: Reference-free machine translation evaluation by inter-system ranking

Jan 30, 2024
Via arXiv

GPTBIAS: A Comprehensive Framework for Evaluating Bias in Large Language Models

Dec 11, 2023
Via arXiv

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

Dec 05, 2023
Via arXiv