Data Poisoning


Data poisoning is the process of manipulating training data to compromise the performance of machine learning models.

LLM Hypnosis: Exploiting User Feedback for Unauthorized Knowledge Injection to All Users

Jul 03, 2025

School of Reward Hacks: Hacking harmless tasks generalizes to misaligned behavior in LLMs

Aug 24, 2025

Quantifying Conversation Drift in MCP via Latent Polytope

Aug 08, 2025

A Linear Approach to Data Poisoning

May 21, 2025

ImportSnare: Directed "Code Manual" Hijacking in Retrieval-Augmented Code Generation

Sep 09, 2025

Sybil-based Virtual Data Poisoning Attacks in Federated Learning

May 15, 2025

Securing Traffic Sign Recognition Systems in Autonomous Vehicles

Jun 06, 2025

Collapsing Sequence-Level Data-Policy Coverage via Poisoning Attack in Offline Reinforcement Learning

Jun 12, 2025

Data Shifts Hurt CoT: A Theoretical Study

Jun 12, 2025

Can In-Context Reinforcement Learning Recover From Reward Poisoning Attacks?

Jun 07, 2025