Benyou Wang

LiveClin: A Live Clinical Benchmark without Leakage

Feb 18, 2026

Towards Fair and Comprehensive Evaluation of Routers in Collaborative LLM Systems

Feb 12, 2026

ClinAlign: Scaling Healthcare Alignment from Clinician Preference

Feb 11, 2026

To What Extent Do Token-Level Representations from Pathology Foundation Models Improve Dense Prediction?

Feb 03, 2026

Character-R1: Enhancing Role-Aware Reasoning in Role-Playing Agents via RLVR

Jan 08, 2026

DentalGPT: Incentivizing Multimodal Complex Reasoning in Dentistry

Dec 12, 2025

Human or LLM as Standardized Patients? A Comparative Study for Medical Education

Nov 12, 2025

EchoMind: An Interrelated Multi-level Benchmark for Evaluating Empathetic Speech Language Models

Oct 26, 2025

Decoding the Ear: A Framework for Objectifying Expressiveness from Human Preference Through Efficient Alignment

Oct 23, 2025

Can Multimodal LLMs See Materials Clearly? A Multimodal Benchmark on Materials Characterization

Sep 11, 2025