Kai Shu

Measuring Sycophancy of Language Models in Multi-turn Dialogues

May 28, 2025

SACM: SEEG-Audio Contrastive Matching for Chinese Speech Decoding

May 26, 2025

Can Multimodal LLMs Perform Time Series Anomaly Detection?

Feb 25, 2025

Benchmarking LLMs for Political Science: A United Nations Perspective

Feb 19, 2025

Understanding and Tackling Label Errors in Individual-Level Nature Language Understanding

Feb 18, 2025

Graph with Sequence: Broad-Range Semantic Modeling for Fake News Detection

Dec 07, 2024

ConQRet: Benchmarking Fine-Grained Evaluation of Retrieval Augmented Argumentation with LLM Judges

Dec 06, 2024

From Generation to Judgment: Opportunities and Challenges of LLM-as-a-judge

Nov 25, 2024

Piecing It All Together: Verifying Multi-Hop Multimodal Claims

Nov 14, 2024

ClinicalBench: Can LLMs Beat Traditional ML Models in Clinical Prediction?

Nov 10, 2024