Mark Dras

Microsoft Research Institute, Macquarie University

Seeing the Threat: Vulnerabilities in Vision-Language Models to Adversarial Attack

May 28, 2025

A Survey on Progress in LLM Alignment from the Perspective of Reward Design

May 05, 2025

Bi-directional Model Cascading with Proxy Confidence

Apr 27, 2025

Myanmar XNLI: Building a Dataset and Exploring Low-resource Approaches to Natural Language Inference with Myanmar

Apr 13, 2025

Defending Deep Neural Networks against Backdoor Attacks via Module Switching

Apr 08, 2025

Empirical Calibration and Metric Differential Privacy in Language Models

Mar 18, 2025

VaxGuard: A Multi-Generator, Multi-Type, and Multi-Role Dataset for Detecting LLM-Generated Vaccine Misinformation

Mar 12, 2025

VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare

Feb 19, 2025

Comparing privacy notions for protection against reconstruction attacks in machine learning

Feb 06, 2025

Suspiciousness of Adversarial Texts to Human

Oct 06, 2024