Alina Oprea

Cascading Adversarial Bias from Injection to Distillation in Language Models

May 30, 2025

R1dacted: Investigating Local Censorship in DeepSeek's R1 Language Model

May 19, 2025

ACE: A Security Architecture for LLM-Integrated App Systems

Apr 29, 2025

SAGA: A Security Architecture for Governing AI Agentic Systems

Apr 27, 2025

Quantitative Resilience Modeling for Autonomous Cyber Defense

Mar 04, 2025

DROP: Poison Dilution via Knowledge Distillation for Federated Learning

Feb 10, 2025

Adversarial Inception for Bounded Backdoor Poisoning in Deep Reinforcement Learning

Oct 21, 2024

Model-agnostic clean-label backdoor mitigation in cybersecurity environments

Jul 11, 2024

Phantom: General Trigger Attacks on Retrieval Augmented Language Generation

May 30, 2024

SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents

May 30, 2024