Chuxue Cao

MedInsightBench: Evaluating Medical Analytics Agents Through Multi-Step Insight Discovery in Multimodal Medical Data

Dec 15, 2025
Via arXiv

SafeLawBench: Towards Safe Alignment of Large Language Models

Jun 07, 2025
Via arXiv

Measuring Hong Kong Massive Multi-Task Language Understanding

May 04, 2025
Via arXiv