Yerin Hwang

Judging Against the Reference: Uncovering Knowledge-Driven Failures in LLM-Judges on QA Evaluation

Jan 12, 2026
Via arXiv

Benchmarking LLM Causal Reasoning with Scientifically Validated Relationships

Oct 08, 2025
Via arXiv

Can You Trick the Grader? Adversarial Persuasion of LLM Judges

Aug 11, 2025
Via arXiv

Don't Judge Code by Its Cover: Exploring Biases in LLM Judges for Code Evaluation

May 22, 2025
Via arXiv

Fooling the LVLM Judges: Visual Biases in LVLM-Based Evaluation

May 21, 2025
Via arXiv

LLMs can be easily Confused by Instructional Distractions

Feb 05, 2025
Via arXiv

Are LLM-Judges Robust to Expressions of Uncertainty? Investigating the effect of Epistemic Markers on LLM-based Evaluation

Oct 28, 2024
Via arXiv

SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models

Oct 25, 2024
Via arXiv

MP2D: An Automated Topic Shift Dialogue Generation Framework Leveraging Knowledge Graphs

Mar 09, 2024
Via arXiv

Dialogizer: Context-aware Conversational-QA Dataset Generation from Textual Sources

Nov 09, 2023
Via arXiv