Zhuohan Xie

VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration

May 26, 2025
Via arXiv

LLM-BABYBENCH: Understanding and Evaluating Grounded Planning and Reasoning in LLMs

May 17, 2025
Via arXiv

A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs

May 13, 2025
Via arXiv

Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh

Mar 03, 2025
Via arXiv

Entity Framing and Role Portrayal in the News

Feb 20, 2025
Via arXiv

KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan

Feb 18, 2025
Via arXiv

Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

Feb 17, 2025
Via arXiv

GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human

Jan 19, 2025
Via arXiv

LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection

Aug 08, 2024
Via arXiv

DeltaScore: Evaluating Story Generation with Differentiating Perturbations

Mar 15, 2023
Via arXiv