Picture for Shiyao Cui

Shiyao Cui

Exploring Multimodal Challenges in Toxic Chinese Detection: Taxonomy, Benchmark, and Findings

Add code
May 30, 2025
Viaarxiv icon

How Should We Enhance the Safety of Large Reasoning Models: An Empirical Study

Add code
May 21, 2025
Viaarxiv icon

Be Careful When Fine-tuning On Open-Source LLMs: Your Fine-tuning Data Could Be Secretly Stolen!

Add code
May 21, 2025
Viaarxiv icon

ShieldVLM: Safeguarding the Multimodal Implicit Toxicity via Deliberative Reasoning with LVLMs

Add code
May 20, 2025
Viaarxiv icon

LongSafety: Evaluating Long-Context Safety of Large Language Models

Add code
Feb 24, 2025
Viaarxiv icon

AISafetyLab: A Comprehensive Framework for AI Safety Evaluation and Improvement

Add code
Feb 24, 2025
Viaarxiv icon

Human Decision-making is Susceptible to AI-driven Manipulation

Add code
Feb 11, 2025
Viaarxiv icon

Agent-SafetyBench: Evaluating the Safety of LLM Agents

Add code
Dec 19, 2024
Viaarxiv icon

The Superalignment of Superhuman Intelligence with Large Language Models

Add code
Dec 15, 2024
Figure 1 for The Superalignment of Superhuman Intelligence with Large Language Models
Viaarxiv icon

Global Challenge for Safe and Secure LLMs Track 1

Add code
Nov 21, 2024
Figure 1 for Global Challenge for Safe and Secure LLMs Track 1
Figure 2 for Global Challenge for Safe and Secure LLMs Track 1
Figure 3 for Global Challenge for Safe and Secure LLMs Track 1
Figure 4 for Global Challenge for Safe and Secure LLMs Track 1
Viaarxiv icon