Wenhan Dong

Evaluation Hallucination in Multi-Round Incomplete Information Lateral-Driven Reasoning Tasks

May 28, 2025
Via arXiv

JALMBench: Benchmarking Jailbreak Vulnerabilities in Audio Language Models

May 23, 2025
Via arXiv

Humanizing LLMs: A Survey of Psychological Measurements with Tools, Datasets, and Human-Agent Applications

Apr 30, 2025
Via arXiv

Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models

Apr 18, 2025
Via arXiv