Xiangliang Zhang

KAUST, Saudi Arabia

Dissecting Logical Reasoning in LLMs: A Fine-Grained Evaluation and Supervision Study

Jun 05, 2025
Via arXiv

Seeing the Invisible: Machine learning-Based QPI Kernel Extraction via Latent Alignment

Jun 05, 2025
Via arXiv

SocialMaze: A Benchmark for Evaluating Social Reasoning in Large Language Models

May 29, 2025
Via arXiv

Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models

May 24, 2025
Via arXiv

AdaReasoner: Adaptive Reasoning Enables More Flexible Thinking

May 22, 2025
Via arXiv

A Personalized Conversational Benchmark: Towards Simulating Personalized Conversations

May 20, 2025
Via arXiv

Invisible Entropy: Towards Safe and Efficient Low-Entropy LLM Watermarking

May 20, 2025
Via arXiv

Evaluating and Mitigating Bias in AI-Based Medical Text Generation

Apr 24, 2025
Via arXiv

Beyond Single-Value Metrics: Evaluating and Enhancing LLM Unlearning with Cognitive Diagnosis

Feb 19, 2025
Via arXiv

Breaking Focus: Contextual Distraction Curse in Large Language Models

Feb 03, 2025
Via arXiv