Ali Naseh

R1dacted: Investigating Local Censorship in DeepSeek's R1 Language Model

May 19, 2025

LLM Misalignment via Adversarial RLHF Platforms

Mar 04, 2025

OverThink: Slowdown Attacks on Reasoning LLMs

Feb 05, 2025

OVERTHINKING: Slowdown Attacks on Reasoning LLMs

Feb 04, 2025

Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection

Jan 20, 2025

Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors

Jun 21, 2024

Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images

Apr 21, 2024

Diffence: Fencing Membership Privacy With Diffusion Models

Dec 07, 2023

Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication

Dec 06, 2023

Understanding (Un)Intended Memorization in Text-to-Image Generative Models

Dec 06, 2023