Picture for Xuezhi Cao

Xuezhi Cao

Making Mathematical Reasoning Adaptive

Add code
Oct 06, 2025
Viaarxiv icon

MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models

Add code
Sep 18, 2025
Viaarxiv icon

Instance-level Randomization: Toward More Stable LLM Evaluations

Add code
Sep 16, 2025
Viaarxiv icon

HKD4VLM: A Progressive Hybrid Knowledge Distillation Framework for Robust Multimodal Hallucination and Factuality Detection in VLMs

Add code
Jun 16, 2025
Viaarxiv icon

OIBench: Benchmarking Strong Reasoning Models with Olympiad in Informatics

Add code
Jun 12, 2025
Viaarxiv icon

NTIRE 2025 challenge on Text to Image Generation Model Quality Assessment

Add code
May 22, 2025
Viaarxiv icon

ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations

Add code
May 20, 2025
Viaarxiv icon

Why Not Act on What You Know? Unleashing Safety Potential of LLMs via Self-Aware Guard Enhancement

Add code
May 17, 2025
Viaarxiv icon

Audio Turing Test: Benchmarking the Human-likeness of Large Language Model-based Text-to-Speech Systems in Chinese

Add code
May 16, 2025
Viaarxiv icon

TokenFocus-VQA: Enhancing Text-to-Image Alignment with Position-Aware Focus and Multi-Perspective Aggregations on LVLMs

Add code
Apr 10, 2025
Viaarxiv icon