Baihui Zheng

Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models

May 26, 2025
Via arXiv

Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models

Dec 23, 2024
Via arXiv