Weiyan Shi

How to Interpret Agent Behavior

May 13, 2026

Shepherd: A Runtime Substrate Empowering Meta-Agents with a Formalized Execution Trace

May 11, 2026

Train Yourself as an LLM: Exploring Effects of AI Literacy on Persuasion via Role-playing LLM Training

Apr 03, 2026

Assessing Risks of Large Language Models in Mental Health Support: A Framework for Automated Clinical AI Red Teaming

Feb 23, 2026

Unsafer in Many Turns: Benchmarking and Defending Multi-Turn Safety Risks in Tool-Using Agents

Feb 13, 2026

Open-Source Multimodal Moxin Models with Moxin-VLM and Moxin-VLA

Dec 22, 2025

LLMs Encode Harmfulness and Refusal Separately

Jul 16, 2025

Sword and Shield: Uses and Strategies of LLMs in Navigating Disinformation

Jun 08, 2025

Labeling Messages as AI-Generated Does Not Reduce Their Persuasive Effects

Apr 14, 2025

Proactive Conversational Agents with Inner Thoughts

Dec 31, 2024