Xiuying Chen

SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models

May 29, 2025
Via arXiv

CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis

May 26, 2025
Via arXiv

VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration

May 26, 2025
Via arXiv

Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models

May 24, 2025
Via arXiv

Divide-Fuse-Conquer: Eliciting "Aha Moments" in Multi-Scenario Games

May 22, 2025
Via arXiv

ManipLVM-R1: Reinforcement Learning for Reasoning in Embodied Manipulation with Large Vision-Language Models

May 22, 2025
Via arXiv

Evaluate Bias without Manual Test Sets: A Concept Representation Perspective for LLMs

May 21, 2025
Via arXiv

Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models

May 21, 2025
Via arXiv

Invisible Entropy: Towards Safe and Efficient Low-Entropy LLM Watermarking

May 20, 2025
Via arXiv

Unify Graph Learning with Text: Unleashing LLM Potentials for Session Search

May 20, 2025
Via arXiv