John Collomosse

TrustMark: Universal Watermarking for Arbitrary Resolution Images

Nov 30, 2023
Tu Bui, Shruti Agarwal, John Collomosse

Imperceptible digital watermarking is important in copyright protection, misinformation prevention, and responsible generative AI. We propose TrustMark - a GAN-based watermarking method with a novel architecture and spatio-spectral losses designed to balance the trade-off between watermarked image quality and watermark recovery accuracy. Our model is trained with robustness in mind, withstanding a variety of in-place and out-of-place perturbations on the encoded image. Additionally, we introduce TrustMark-RM - a watermark removal method useful for re-watermarking. Our methods achieve state-of-the-art performance on three benchmarks comprising arbitrary-resolution images.
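
As a rough illustration of the ingredients described above (not the TrustMark implementation itself), the sketch below pairs a residual watermark encoder with a bit-recovery decoder and combines a spatial reconstruction loss with a spectral (FFT-magnitude) term. All module sizes, the noise model, and the loss weights are assumptions made for the example.

```python
# Illustrative sketch (not TrustMark): embed a bit string into an image with a
# residual encoder, recover it with a decoder, and train against a combined
# spatial + spectral (FFT-magnitude) loss plus a simulated perturbation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WatermarkEncoder(nn.Module):
    def __init__(self, payload_bits=64):
        super().__init__()
        self.fc = nn.Linear(payload_bits, 32 * 32)            # payload -> coarse map
        self.conv = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))                    # residual added to cover

    def forward(self, cover, payload):
        b, _, h, w = cover.shape
        m = self.fc(payload).view(b, 1, 32, 32)
        m = F.interpolate(m, size=(h, w), mode="bilinear", align_corners=False)
        residual = self.conv(torch.cat([cover, m], dim=1))
        return (cover + 0.01 * residual).clamp(0, 1)           # near-imperceptible edit

class WatermarkDecoder(nn.Module):
    def __init__(self, payload_bits=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(8), nn.Flatten(), nn.Linear(32 * 64, payload_bits))

    def forward(self, stego):
        return self.net(stego)                                 # payload logits

def spatio_spectral_loss(stego, cover):
    spatial = F.mse_loss(stego, cover)
    spectral = F.l1_loss(torch.fft.rfft2(stego).abs(), torch.fft.rfft2(cover).abs())
    return spatial + 0.1 * spectral

enc, dec = WatermarkEncoder(), WatermarkDecoder()
cover = torch.rand(2, 3, 256, 256)
payload = torch.randint(0, 2, (2, 64)).float()
stego = enc(cover, payload)
noised = (stego + 0.02 * torch.randn_like(stego)).clamp(0, 1)  # simulated perturbation
loss = spatio_spectral_loss(stego, cover) \
       + F.binary_cross_entropy_with_logits(dec(noised), payload)
loss.backward()
```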

DECORAIT -- DECentralized Opt-in/out Registry for AI Training

Sep 25, 2023
Kar Balan, Alex Black, Simon Jenni, Andrew Gilbert, Andy Parsons, John Collomosse

We present DECORAIT, a decentralized registry through which content creators may assert their right to opt in to or out of AI training, as well as receive reward for their contributions. Generative AI (GenAI) enables images to be synthesized using AI models trained on vast amounts of data scraped from public sources. Model and content creators who may wish to share their work openly without sanctioning its use for training are thus presented with a data governance challenge. Further, establishing the provenance of GenAI training data is important to creatives, to ensure fair recognition and reward for such use of their work. We report a prototype of DECORAIT, which explores hierarchical clustering and a combination of on- and off-chain storage to create a scalable decentralized registry that traces the provenance of GenAI training data in order to determine training consent and reward creatives who contribute that data. DECORAIT combines distributed ledger technology (DLT) with visual fingerprinting, leveraging the emerging C2PA (Coalition for Content Provenance and Authenticity) standard to create a secure, open registry through which creatives may express consent and data ownership for GenAI.

* Proc. of the 20th ACM SIGGRAPH European Conference on Visual Media Production 
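
The toy sketch below illustrates the opt-in/out lookup flow such a registry implies: a fingerprint of the image keys a consent record whose bulkier payload lives off-chain. The ConsentRecord fields, the dictionary standing in for the ledger, and the use of a cryptographic hash in place of a perceptual fingerprint are all simplifications made for the example, not the DECORAIT design.

```python
# Toy sketch of an opt-in/out registry lookup; not the DECORAIT implementation.
import hashlib
from dataclasses import dataclass

@dataclass
class ConsentRecord:
    owner: str
    training_consent: bool       # opt-in / opt-out flag
    payload_uri: str             # off-chain location of the full provenance manifest

ledger = {}                      # stand-in for the on-chain key/value registry

def fingerprint(image_bytes: bytes) -> str:
    # Real systems use a perceptual, near-duplicate-robust fingerprint; a
    # cryptographic hash is used here only to keep the example self-contained.
    return hashlib.sha256(image_bytes).hexdigest()

def register(image_bytes: bytes, owner: str, consent: bool, payload_uri: str):
    ledger[fingerprint(image_bytes)] = ConsentRecord(owner, consent, payload_uri)

def may_train_on(image_bytes: bytes) -> bool:
    rec = ledger.get(fingerprint(image_bytes))
    return rec is not None and rec.training_consent

register(b"...image bytes...", owner="creator@example", consent=False,
         payload_uri="ipfs://example-manifest")
print(may_train_on(b"...image bytes..."))   # False: the creator opted out
```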

DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer

Jul 11, 2023
Dan Ruta, Gemma Canet Tarrés, Andrew Gilbert, Eli Shechtman, Nicholas Kolkin, John Collomosse

Neural Style Transfer (NST) is the field of study applying neural techniques to modify the artistic appearance of a content image to match the style of a reference style image. Traditionally, NST methods have focused on texture-based image edits, affecting mostly low-level information and keeping most image structures the same. However, style-based deformation of the content is desirable for some styles, especially where the style is abstract or where its primary concept lies in a deformed rendition of the content. With the recent introduction of diffusion models, such as Stable Diffusion, we can access far more powerful image generation techniques, enabling new possibilities. In our work, we propose using this new class of models to perform deformable style transfer, a capability that has eluded previous models. We show how leveraging the priors of these models can expose new artistic controls at inference time, and we document our findings in exploring this new direction for the field of style transfer.
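
A schematic of the general idea (not the DIFF-NST method): style features are injected at every step of a diffusion denoising loop while the intermediate prediction is pulled toward a content latent. The toy denoiser, noise schedule, and blending weights below are placeholders chosen only to make the example run.

```python
# Schematic sketch of interleaving style conditioning into a denoising loop,
# with a toy denoiser standing in for a real diffusion model.
import torch
import torch.nn as nn

class ToyDenoiser(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(6, 3, 3, padding=1)   # latent + injected style features

    def forward(self, x, style_feat):
        return self.net(torch.cat([x, style_feat], dim=1))

denoiser = ToyDenoiser()
steps = 50
betas = torch.linspace(1e-4, 0.02, steps)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)

content_latent = torch.randn(1, 3, 64, 64)        # encodes the content image
style_feat = torch.randn(1, 3, 64, 64)            # features from the style image

x = torch.randn_like(content_latent)              # start from noise
with torch.no_grad():
    for t in reversed(range(steps)):
        eps = denoiser(x, style_feat)              # style interleaved at every step
        x0_hat = (x - (1 - alphas_bar[t]).sqrt() * eps) / alphas_bar[t].sqrt()
        x0_hat = 0.7 * x0_hat + 0.3 * content_latent   # keep structure near content
        if t > 0:
            noise = torch.randn_like(x)
            x = alphas_bar[t - 1].sqrt() * x0_hat + (1 - alphas_bar[t - 1]).sqrt() * noise
        else:
            x = x0_hat
```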

ALADIN-NST: Self-supervised disentangled representation learning of artistic style through Neural Style Transfer

Apr 12, 2023
Dan Ruta, Gemma Canet Tarres, Alex Black, Andrew Gilbert, John Collomosse

Representation learning aims to discover individual salient features of a domain in a compact and descriptive form that strongly identifies the unique characteristics of a given sample with respect to its domain. Existing works in the visual style representation literature have tried to explicitly disentangle style from content during training, but a complete separation has yet to be achieved. Our paper aims to learn a representation of visual artistic style that is more strongly disentangled from the semantic content depicted in an image. We use Neural Style Transfer (NST) to measure and drive the learning signal, and achieve state-of-the-art representation learning on explicitly disentangled metrics. We show that strongly addressing the disentanglement of style and content leads to large gains in style-specific metrics, encoding far less semantic information and achieving state-of-the-art accuracy in downstream multimodal applications.
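
As a minimal sketch of using NST outputs as a learning signal for a disentangled style representation (not the ALADIN-NST architecture), the example below pulls together embeddings of NST renderings that share a style but differ in content, using an InfoNCE-style contrastive loss. The encoder, loss, and batch construction are assumptions made for the example.

```python
# Minimal sketch: a style encoder trained so that NST renderings sharing a
# style (but not content) map to nearby embeddings (InfoNCE over the batch).
import torch
import torch.nn as nn
import torch.nn.functional as F

class StyleEncoder(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, dim))

    def forward(self, x):
        return F.normalize(self.backbone(x), dim=-1)

def style_contrastive_loss(anchor, positive, temperature=0.1):
    # anchor[i] and positive[i] are stylizations of *different* content in the
    # same style; other rows of the batch act as negatives.
    logits = anchor @ positive.t() / temperature
    targets = torch.arange(anchor.size(0))
    return F.cross_entropy(logits, targets)

enc = StyleEncoder()
same_style_a = torch.rand(8, 3, 128, 128)   # stands in for NST(content_1, style_i)
same_style_b = torch.rand(8, 3, 128, 128)   # stands in for NST(content_2, style_i)
loss = style_contrastive_loss(enc(same_style_a), enc(same_style_b))
loss.backward()
```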

NeAT: Neural Artistic Tracing for Beautiful Style Transfer

Apr 11, 2023
Dan Ruta, Andrew Gilbert, John Collomosse, Eli Shechtman, Nicholas Kolkin

Style transfer is the task of reproducing the semantic content of a source image in the artistic style of a second, target image. In this paper, we present NeAT, a new state-of-the-art feed-forward style transfer method. We re-formulate feed-forward style transfer as image editing, rather than image generation, resulting in a model which improves over the state of the art in both preserving the source content and matching the target style. An important component of our model's success is identifying and fixing "style halos", a commonly occurring artefact across many style transfer techniques. In addition to training and testing on standard datasets, we introduce BBST-4M, a new, large-scale, high-resolution dataset of 4M images. As part of curating this data, we present a novel model able to classify whether an image is stylistic. We use BBST-4M to improve and measure the generalization of NeAT across a huge variety of styles. Not only does NeAT offer state-of-the-art quality and generalization, it is designed and trained for fast inference at high resolution.
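
The sketch below illustrates the "style transfer as image editing" framing in its simplest form: a network predicts an edit that is added to the source image, rather than regenerating pixels from scratch. The architecture, global style code, and scale of the edit are placeholders for illustration, not NeAT itself.

```python
# Hedged sketch of style transfer framed as editing: predict a residual edit
# conditioned on a global style code and apply it to the source image.
import torch
import torch.nn as nn

class EditingStyleTransfer(nn.Module):
    def __init__(self):
        super().__init__()
        self.style_enc = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 32))
        self.edit_net = nn.Sequential(
            nn.Conv2d(3 + 32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, content, style):
        s = self.style_enc(style)                                 # global style code
        s_map = s[:, :, None, None].expand(-1, -1, *content.shape[2:])
        edit = self.edit_net(torch.cat([content, s_map], dim=1))
        return (content + edit).clamp(0, 1)                       # edit, not regenerate

model = EditingStyleTransfer()
out = model(torch.rand(1, 3, 256, 256), torch.rand(1, 3, 256, 256))
```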

EKILA: Synthetic Media Provenance and Attribution for Generative Art

Apr 10, 2023
Kar Balan, Shruti Agarwal, Simon Jenni, Andy Parsons, Andrew Gilbert, John Collomosse

We present EKILA, a decentralized framework that enables creatives to receive recognition and reward for their contributions to generative AI (GenAI). EKILA proposes a robust visual attribution technique and combines this with an emerging content provenance standard (C2PA) to address the problem of synthetic image provenance: determining the generative model and training data responsible for an AI-generated image. Furthermore, EKILA extends the non-fungible token (NFT) ecosystem to introduce a tokenized representation for rights, enabling a triangular relationship between an asset's Ownership, Rights, and Attribution (ORA). Leveraging the ORA relationship enables creators to express agency over training consent and, through our attribution model, to receive apportioned credit, including royalty payments for the use of their assets in GenAI.

* Proc. CVPR Workshop on Media Forensics 2023 
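
A small data-structure sketch of the ORA idea and attribution-weighted royalty apportionment is given below; the token fields and the payout rule are assumptions made for illustration, not EKILA's actual schema or token logic.

```python
# Illustrative sketch of Ownership / Rights / Attribution (ORA) records and a
# simple attribution-weighted royalty split; field names are hypothetical.
from dataclasses import dataclass

@dataclass
class OwnershipToken:          # e.g. an NFT identifying the asset and its owner
    asset_id: str
    owner: str

@dataclass
class RightsToken:             # tokenized expression of training consent / terms
    asset_id: str
    training_allowed: bool
    royalty_share: float       # fraction of apportioned royalties kept by this asset

def apportion_royalties(pool: float, attributions: dict[str, float],
                        rights: dict[str, RightsToken]) -> dict[str, float]:
    # attributions: asset_id -> attribution score for a generated image
    eligible = {a: s for a, s in attributions.items() if rights[a].training_allowed}
    total = sum(eligible.values()) or 1.0
    return {a: pool * (s / total) * rights[a].royalty_share
            for a, s in eligible.items()}

rights = {"img-1": RightsToken("img-1", True, 1.0),
          "img-2": RightsToken("img-2", True, 1.0)}
print(apportion_royalties(10.0, {"img-1": 0.7, "img-2": 0.3}, rights))
```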

RoSteALS: Robust Steganography using Autoencoder Latent Space

Apr 06, 2023
Tu Bui, Shruti Agarwal, Ning Yu, John Collomosse

Data hiding techniques such as steganography and invisible watermarking have important applications in copyright protection, privacy-preserving communication and content provenance. Existing works often fall short in preserving image quality or robustness against perturbations, or are too complex to train. We propose RoSteALS, a practical steganography technique that leverages frozen pretrained autoencoders to free the payload embedding from learning the distribution of cover images. RoSteALS has a lightweight secret encoder of just 300k parameters, is easy to train, and achieves perfect secret recovery with comparable image quality on three benchmarks. Additionally, RoSteALS can be adapted for novel cover-less steganography applications in which the cover image is sampled from noise or conditioned on text prompts via a denoising diffusion process. Our model and code are available at https://github.com/TuBui/RoSteALS.

* accepted to CVPR WMF 2023 
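
The core idea can be sketched as follows, with toy networks standing in for the frozen pretrained autoencoder and for the secret encoder/decoder; the offset scale and network sizes are assumptions for the example, and the real implementation is in the repository linked above.

```python
# Minimal sketch: a small secret encoder maps payload bits to an offset added
# to a *frozen* autoencoder latent; a secret decoder recovers the bits from
# the reconstructed image. Only the secret modules would be trained.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyAutoencoder(nn.Module):               # stand-in for a frozen pretrained VAE
    def __init__(self):
        super().__init__()
        self.enc = nn.Conv2d(3, 4, 4, stride=4)
        self.dec = nn.ConvTranspose2d(4, 3, 4, stride=4)

    def encode(self, x): return self.enc(x)
    def decode(self, z): return torch.sigmoid(self.dec(z))

autoenc = ToyAutoencoder().eval()
for p in autoenc.parameters():
    p.requires_grad_(False)                    # frozen: payload embedding learned elsewhere

secret_encoder = nn.Linear(64, 4 * 64 * 64)   # payload bits -> latent offset
secret_decoder = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(4), nn.Flatten(), nn.Linear(16 * 16, 64))

cover = torch.rand(2, 3, 256, 256)
bits = torch.randint(0, 2, (2, 64)).float()

z = autoenc.encode(cover)                              # latent of the cover image
offset = secret_encoder(bits).view(2, 4, 64, 64)
stego = autoenc.decode(z + 0.1 * offset)               # embed the payload in latent space
loss = F.binary_cross_entropy_with_logits(secret_decoder(stego), bits) \
       + F.mse_loss(stego, cover)
loss.backward()
```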

PARASOL: Parametric Style Control for Diffusion Image Synthesis

Mar 27, 2023
Gemma Canet Tarrés, Dan Ruta, Tu Bui, John Collomosse

We propose PARASOL, a multi-modal synthesis model that enables disentangled, parametric control over the visual style of an image by jointly conditioning synthesis on both content and a fine-grained visual style embedding. We train a latent diffusion model (LDM) using specific losses for each modality and adapt classifier-free guidance to encourage disentangled control over the independent content and style modalities at inference time. We leverage auxiliary semantic and style-based search to create training triplets for supervision of the LDM, ensuring complementarity of content and style cues. PARASOL shows promise for enabling nuanced control over visual style in diffusion models for image creation and stylization, as well as for generative search, where text-based search results may be adapted to more closely match user intent by interpolating both content and style descriptors.

* Added Appendix 
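
The sketch below shows one common way classifier-free guidance can be extended to two conditioning modalities, composing an unconditional prediction with separately weighted content and style terms; the toy denoiser, embeddings, and guidance weights are illustrative assumptions, not the PARASOL model.

```python
# Hedged sketch of multi-modality classifier-free guidance: combine an
# unconditional pass with separately weighted content-only and style-only
# passes, so each modality's influence is tunable at inference time.
import torch
import torch.nn as nn

class ToyConditionalDenoiser(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Conv2d(4 + 2 * dim, 4, 3, padding=1)

    def forward(self, x, content_emb, style_emb):
        b, _, h, w = x.shape
        c = content_emb[:, :, None, None].expand(b, -1, h, w)
        s = style_emb[:, :, None, None].expand(b, -1, h, w)
        return self.net(torch.cat([x, c, s], dim=1))

def guided_eps(model, x, content, style, null_c, null_s, w_content=4.0, w_style=2.0):
    eps_uncond = model(x, null_c, null_s)
    eps_content = model(x, content, null_s)
    eps_style = model(x, null_c, style)
    return (eps_uncond
            + w_content * (eps_content - eps_uncond)
            + w_style * (eps_style - eps_uncond))

model = ToyConditionalDenoiser()
x = torch.randn(1, 4, 32, 32)
content, style = torch.randn(1, 16), torch.randn(1, 16)
null_c, null_s = torch.zeros(1, 16), torch.zeros(1, 16)
eps = guided_eps(model, x, content, style, null_c, null_s)
```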

VADER: Video Alignment Differencing and Retrieval

Mar 25, 2023
Alexander Black, Simon Jenni, Tu Bui, Md. Mehrab Tanjim, Stefano Petrangeli, Ritwik Sinha, Viswanathan Swaminathan, John Collomosse

We propose VADER, a spatio-temporal matching, alignment, and change summarization method to help fight misinformation spread via manipulated videos. VADER matches and coarsely aligns partial video fragments to candidate videos using a robust visual descriptor and scalable search over adaptively chunked video content. A transformer-based alignment module then refines the temporal localization of the query fragment within the matched video. A space-time comparator module identifies regions of manipulation between aligned content, invariant to residual temporal misalignment and to artifacts arising from non-editorial changes to the content. Robustly matching video to a trusted source enables conclusions to be drawn about video provenance, supporting informed trust decisions on the content encountered.
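
A simplified sketch of the coarse matching stage is shown below: per-frame descriptors are averaged into chunks, and each candidate video and temporal offset is scored by cosine similarity. The fixed chunk length and brute-force offset search are simplifications for the example; VADER's adaptive chunking and transformer-based refinement are not reproduced here.

```python
# Simplified sketch of coarse video fragment matching via chunked descriptors.
import torch
import torch.nn.functional as F

def chunk_descriptors(frame_desc: torch.Tensor, chunk_len: int) -> torch.Tensor:
    # frame_desc: (num_frames, dim) per-frame visual descriptors
    n = (frame_desc.shape[0] // chunk_len) * chunk_len
    chunks = frame_desc[:n].view(-1, chunk_len, frame_desc.shape[1]).mean(dim=1)
    return F.normalize(chunks, dim=-1)

def best_match(query: torch.Tensor, candidates: list[torch.Tensor], chunk_len=8):
    q = chunk_descriptors(query, chunk_len)
    best = (None, None, -1.0)                        # (video index, frame offset, score)
    for idx, cand in enumerate(candidates):
        c = chunk_descriptors(cand, chunk_len)
        for off in range(c.shape[0] - q.shape[0] + 1):
            score = (q * c[off:off + q.shape[0]]).sum(dim=-1).mean().item()
            if score > best[2]:
                best = (idx, off * chunk_len, score)
    return best                                       # coarse offset for later refinement

query = torch.randn(64, 128)                          # 64-frame query fragment
candidates = [torch.randn(400, 128), torch.randn(250, 128)]
print(best_match(query, candidates))
```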
