Yuxia Wang

Can Machines Resonate with Humans? Evaluating the Emotional and Empathic Comprehension of LMs

Jun 17, 2024

Exploring the Potential of Multimodal LLM with Knowledge-Intensive Multimodal ASR

Jun 16, 2024

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

May 09, 2024

SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection

Apr 22, 2024

Against The Achilles' Heel: A Survey on Red Teaming for Generative Models

Mar 31, 2024

A Chinese Dataset for Evaluating the Safeguards in Large Language Models

Feb 19, 2024

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection

Feb 17, 2024

Factuality of Large Language Models in the Year 2024

Feb 09, 2024

Understanding the Instruction Mixture for Large Language Model Fine-tuning

Dec 19, 2023

Factcheck-GPT: End-to-End Fine-Grained Document-Level Fact-Checking and Correction of LLM Output

Nov 16, 2023