Yingchun Wang

Towards Context-Invariant Safety Alignment for Large Language Models

May 20, 2026
Via arXiv

Safactory: A Scalable Agent Factory for Trustworthy Autonomous Intelligence

May 07, 2026
Via arXiv

Mechanistic Origin of Moral Indifference in Language Models

Mar 16, 2026
Via arXiv

From Sparse Decisions to Dense Reasoning: A Multi-attribute Trajectory Paradigm for Multimodal Moderation

Jan 28, 2026
Via arXiv

OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

Jan 04, 2026
Via arXiv

UniMark: Artificial Intelligence Generated Content Identification Toolkit

Dec 13, 2025
Via arXiv

Evolve the Method, Not the Prompts: Evolutionary Synthesis of Jailbreak Attacks on LLMs

Nov 16, 2025
Via arXiv

Beyond Correctness: Confidence-Aware Reward Modeling for Enhancing Large Language Model Reasoning

Nov 09, 2025
Via arXiv

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Oct 02, 2025
Via arXiv

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law

Jul 24, 2025
Via arXiv