Abstract: The increasing integration of Visual Language Models (VLMs) into AI systems necessitates robust model alignment, especially when handling multimodal content that combines text and images. Existing evaluation datasets lean heavily towards text-only prompts, leaving visual vulnerabilities under-evaluated. To address this gap, we propose \textbf{Text2VLM}, a novel multi-stage pipeline that adapts text-only datasets into multimodal formats, specifically designed to evaluate the resilience of VLMs against typographic prompt injection attacks. The Text2VLM pipeline identifies harmful content in the original text and converts it into a typographic image, creating a multimodal prompt for VLMs. Our evaluation of open-source VLMs shows that they become markedly more susceptible to prompt injection when visual inputs are introduced, revealing critical weaknesses in current models' alignment, in addition to a significant performance gap relative to closed-source frontier models. We validate Text2VLM through human evaluations, confirming that the extracted salient concepts, text summaries, and output classifications align with human expectations. Text2VLM provides a scalable tool for comprehensive safety assessment, contributing to the development of more robust safety mechanisms for VLMs. By enhancing the evaluation of multimodal vulnerabilities, Text2VLM plays a role in advancing the safe deployment of VLMs in diverse, real-world applications.
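To make the text-to-typographic-image conversion step concrete, the following is a minimal sketch of how a prompt might be split into a benign text part and a typographic image carrying the salient content. It assumes a PIL-based renderer; the function names, prompt structure, and the placeholder substitution are illustrative assumptions, not the paper's actual Text2VLM implementation.

```python
# Minimal sketch of a Text2VLM-style conversion step (illustrative only).
# The concept-extraction heuristic and prompt format are assumptions.
from PIL import Image, ImageDraw, ImageFont


def render_typographic_image(text: str, size=(512, 256)) -> Image.Image:
    """Render the extracted salient text as a plain typographic image."""
    img = Image.new("RGB", size, color="white")
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()
    draw.multiline_text((10, 10), text, fill="black", font=font)
    return img


def build_multimodal_prompt(original_prompt: str, salient_concepts: str) -> dict:
    """Keep benign framing as text; move the salient content into the image."""
    image = render_typographic_image(salient_concepts)
    text_part = original_prompt.replace(
        salient_concepts, "the instructions shown in the image"
    )
    return {"text": text_part, "image": image}


# Hypothetical usage: the original prompt and extracted concepts are placeholders.
prompt = build_multimodal_prompt(
    original_prompt="Explain how to do X step by step.",
    salient_concepts="how to do X",
)
prompt["image"].save("typographic_prompt.png")
```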
Abstract: A popular strategy for active learning is to specifically target a reduction in epistemic uncertainty, since aleatoric uncertainty is often considered intrinsic to the system of interest and therefore not reducible. Yet, distinguishing these two types of uncertainty remains challenging, and no single strategy consistently outperforms the others. We propose a particular combination of probability and possibility theories, with the aim of using the latter to specifically represent epistemic uncertainty, and we show how this combination leads to new active learning strategies with desirable properties. To demonstrate the efficiency of these strategies in non-trivial settings, we introduce the notion of a possibilistic Gaussian process (GP) and consider GP-based multiclass and binary classification problems, for which the proposed methods display strong performance on both simulated and real datasets.
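For readers unfamiliar with the epistemic/aleatoric distinction the abstract relies on, the sketch below shows a standard purely probabilistic decomposition of predictive uncertainty for a GP classifier, given Monte Carlo draws of class-probability vectors from the posterior. This is a common baseline (mutual-information, BALD-style), not the paper's possibilistic construction; the sample values are hypothetical.

```python
# Baseline illustration only: probabilistic (not possibilistic) decomposition
# of predictive uncertainty into aleatoric and epistemic parts.
import numpy as np


def uncertainty_decomposition(prob_samples: np.ndarray):
    """prob_samples: (n_draws, n_classes) class probabilities, one row per posterior draw."""
    eps = 1e-12
    mean_p = prob_samples.mean(axis=0)
    total = -np.sum(mean_p * np.log(mean_p + eps))  # entropy of the mean prediction
    aleatoric = -np.mean(np.sum(prob_samples * np.log(prob_samples + eps), axis=1))
    epistemic = total - aleatoric  # mutual information between label and latent function
    return total, aleatoric, epistemic


# Example: three posterior draws over two classes at one candidate input.
draws = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
total, aleatoric, epistemic = uncertainty_decomposition(draws)
print(f"total={total:.3f}, aleatoric={aleatoric:.3f}, epistemic={epistemic:.3f}")
```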