Sahar Abdelnabi

Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Mar 11, 2024
Egor Zverev, Sahar Abdelnabi, Mario Fritz, Christoph H. Lampert

Exploring Value Biases: How LLMs Deviate Towards the Ideal
Feb 21, 2024
Sarath Sivaprasad, Pramod Kaushik, Sahar Abdelnabi, Mario Fritz

LLM-Deliberation: Evaluating LLMs with Interactive Multi-Agent Negotiation Games
Sep 29, 2023
Sahar Abdelnabi, Amr Gomaa, Sarath Sivaprasad, Lea Schönherr, Mario Fritz

More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models
Feb 23, 2023
Kai Greshake, Sahar Abdelnabi, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz

Fact-Saboteurs: A Taxonomy of Evidence Manipulation Attacks against Fact-Verification Systems
Sep 07, 2022
Sahar Abdelnabi, Mario Fritz

Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources
Dec 07, 2021
Sahar Abdelnabi, Rakibul Hasan, Mario Fritz

"What's in the box?!": Deflecting Adversarial Attacks by Randomly Deploying Adversarially-Disjoint Models

Add code
Bookmark button
Alert button
Mar 09, 2021
Sahar Abdelnabi, Mario Fritz

Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding
Sep 07, 2020
Sahar Abdelnabi, Mario Fritz