Yu'an Yang

ProText: A benchmark dataset for measuring (mis)gendering in long-form texts

Mar 29, 2026
Via arXiv

TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Feb 20, 2024
Via arXiv