
Xinyue Shen

Real Money, Fake Models: Deceptive Model Claims in Shadow APIs

Mar 05, 2026

Benchmark of Benchmarks: Unpacking Influence and Code Repository Quality in LLM Safety Benchmarks

Mar 03, 2026

Spatiotemporal Calibration for Laser Vision Sensor in Hand-eye System Based on Straight-line Constraint

Sep 16, 2025

HateBench: Benchmarking Hate Speech Detectors on LLM-Generated Content and Hate Campaigns

Jan 28, 2025

Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media

Dec 24, 2024

Voice Jailbreak Attacks Against GPT-4o

May 29, 2024

UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images

May 06, 2024

Comprehensive Assessment of Jailbreak Attacks Against LLMs

Feb 08, 2024

"Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

Aug 07, 2023

Unsafe Diffusion: On the Generation of Unsafe Images and Hateful Memes From Text-To-Image Models

May 23, 2023