Tu Bui

TrustMark: Universal Watermarking for Arbitrary Resolution Images

Nov 30, 2023
Tu Bui, Shruti Agarwal, John Collomosse

Imperceptible digital watermarking is important for copyright protection, misinformation prevention, and responsible generative AI. We propose TrustMark, a GAN-based watermarking method with a novel architecture and spatio-spectral losses that balance the trade-off between watermarked image quality and watermark recovery accuracy. Our model is trained with robustness in mind, withstanding in-place and out-of-place perturbations on the encoded image. Additionally, we introduce TrustMark-RM, a watermark removal method useful for re-watermarking. Our methods achieve state-of-the-art performance on three benchmarks comprising arbitrary-resolution images.
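
The spatio-spectral losses mentioned above combine pixel-domain and frequency-domain fidelity terms. A minimal PyTorch sketch of one such loss, assuming an L1 pixel term plus an FFT-magnitude term; the weights and exact formulation are illustrative, not the paper's:

```python
# A spatio-spectral image-quality loss: penalise the watermark residual in both the
# pixel domain and the frequency domain. Weights and formulation are illustrative.
import torch
import torch.nn.functional as F

def spatio_spectral_loss(cover: torch.Tensor, stego: torch.Tensor,
                         w_spatial: float = 1.0, w_spectral: float = 0.1) -> torch.Tensor:
    # Pixel-domain fidelity between the cover and the watermarked (stego) image.
    spatial = F.l1_loss(stego, cover)
    # Frequency-domain fidelity: compare magnitudes of the 2D FFT.
    cover_fft = torch.fft.fft2(cover, norm="ortho")
    stego_fft = torch.fft.fft2(stego, norm="ortho")
    spectral = F.l1_loss(stego_fft.abs(), cover_fft.abs())
    return w_spatial * spatial + w_spectral * spectral

# Toy usage: a batch of 4 RGB images with a small simulated watermark residual.
cover = torch.rand(4, 3, 256, 256)
stego = cover + 0.01 * torch.randn_like(cover)
print(spatio_spectral_loss(cover, stego).item())
```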

RoSteALS: Robust Steganography using Autoencoder Latent Space

Apr 06, 2023
Tu Bui, Shruti Agarwal, Ning Yu, John Collomosse

Data hiding techniques such as steganography and invisible watermarking have important applications in copyright protection, privacy-preserving communication, and content provenance. Existing works often fall short in preserving image quality or in robustness against perturbations, or are too complex to train. We propose RoSteALS, a practical steganography technique that leverages frozen pretrained autoencoders to decouple payload embedding from learning the distribution of cover images. RoSteALS has a lightweight secret encoder of just 300k parameters, is easy to train, and achieves perfect secret recovery with comparable image quality on three benchmarks. Additionally, RoSteALS can be adapted for novel cover-less steganography applications in which the cover image is sampled from noise or conditioned on text prompts via a denoising diffusion process. Our model and code are available at https://github.com/TuBui/RoSteALS.

* accepted to CVPR WMF 2023 
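
A minimal sketch of the core idea, assuming PyTorch and a toy stand-in for the frozen autoencoder; module names, sizes, and the encode/decode interface are assumptions for illustration, not the released implementation:

```python
# A small secret encoder maps a bit string to an offset that is added to the latent of
# a frozen, pretrained autoencoder. Sizes and interfaces below are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SecretEncoder(nn.Module):
    """Lightweight MLP that turns a payload of `n_bits` into a latent offset."""
    def __init__(self, n_bits: int = 100, latent_ch: int = 4, latent_hw: int = 32):
        super().__init__()
        self.latent_shape = (latent_ch, latent_hw, latent_hw)
        self.net = nn.Sequential(
            nn.Linear(n_bits, 256), nn.SiLU(),
            nn.Linear(256, latent_ch * latent_hw * latent_hw),
        )

    def forward(self, bits: torch.Tensor) -> torch.Tensor:
        return self.net(bits).view(-1, *self.latent_shape)

class ToyAE:
    """Stand-in for a frozen pretrained autoencoder (e.g. a KL/VQ autoencoder)."""
    def encode(self, x):   # (B, 3, 256, 256) -> (B, 4, 32, 32)
        return F.avg_pool2d(x.mean(1, keepdim=True).repeat(1, 4, 1, 1), 8)

    def decode(self, z):   # (B, 4, 32, 32) -> (B, 3, 256, 256)
        return F.interpolate(z[:, :3], scale_factor=8)

def embed_secret(frozen_ae, cover, bits, secret_encoder):
    """Encode the cover with the frozen autoencoder, perturb its latent, then decode."""
    with torch.no_grad():                      # the autoencoder is never updated
        z = frozen_ae.encode(cover)
    z_stego = z + secret_encoder(bits)         # the payload lives in latent space
    return frozen_ae.decode(z_stego)

# Toy usage: embed a 100-bit payload into two cover images.
ae, enc = ToyAE(), SecretEncoder()
cover = torch.rand(2, 3, 256, 256)
bits = torch.randint(0, 2, (2, 100)).float()
print(embed_secret(ae, cover, bits, enc).shape)   # torch.Size([2, 3, 256, 256])
```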

PARASOL: Parametric Style Control for Diffusion Image Synthesis

Mar 27, 2023
Gemma Canet Tarrés, Dan Ruta, Tu Bui, John Collomosse

We propose PARASOL, a multi-modal synthesis model that enables disentangled, parametric control over the visual style of a synthesized image by jointly conditioning synthesis on both content and a fine-grained visual style embedding. We train a latent diffusion model (LDM) with modality-specific losses and adapt classifier-free guidance to encourage disentangled control over the independent content and style modalities at inference time. We leverage auxiliary semantic and style-based search to create training triplets for supervision of the LDM, ensuring complementarity of content and style cues. PARASOL shows promise for nuanced control over visual style in diffusion-based image creation and stylization, as well as for generative search, where text-based search results may be adapted to more closely match user intent by interpolating both content and style descriptors.

* Added Appendix 
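
A minimal sketch of classifier-free guidance extended to two conditioning modalities, in the spirit of the disentangled content/style control described above; the denoiser interface, guidance scales, and additive decomposition are assumptions for illustration:

```python
# Classifier-free guidance with separate scales for content and style conditioning.
import torch

def dual_cfg_noise(denoiser, x_t, t, content_emb, style_emb, null_content, null_style,
                   s_content: float = 4.0, s_style: float = 2.0) -> torch.Tensor:
    eps_uncond = denoiser(x_t, t, null_content, null_style)
    eps_content = denoiser(x_t, t, content_emb, null_style)
    eps_style = denoiser(x_t, t, null_content, style_emb)
    # Push away from the unconditional prediction along the content direction and the
    # style direction independently, so each modality can be weighted on its own.
    return (eps_uncond
            + s_content * (eps_content - eps_uncond)
            + s_style * (eps_style - eps_uncond))

# Toy usage with a dummy denoiser that ignores its conditioning inputs.
dummy = lambda x, t, c, s: torch.zeros_like(x)
x_t = torch.randn(1, 4, 32, 32)
print(dual_cfg_noise(dummy, x_t, torch.tensor([10]), None, None, None, None).shape)
```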

VADER: Video Alignment Differencing and Retrieval

Mar 25, 2023
Alexander Black, Simon Jenni, Tu Bui, Md. Mehrab Tanjim, Stefano Petrangeli, Ritwik Sinha, Viswanathan Swaminathan, John Collomosse

We propose VADER, a spatio-temporal matching, alignment, and change summarization method to help fight misinformation spread via manipulated videos. VADER matches and coarsely aligns partial video fragments to candidate videos using a robust visual descriptor and scalable search over adaptively chunked video content. A transformer-based alignment module then refines the temporal localization of the query fragment within the matched video. A space-time comparator module identifies regions of manipulation between the aligned content, invariant to changes due to residual temporal misalignment or artifacts arising from non-editorial changes to the content. Robustly matching a video to a trusted source enables conclusions to be drawn about its provenance, supporting informed trust decisions on the content encountered.
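
A minimal sketch of the coarse matching stage only, assuming per-frame visual descriptors are already available; the fixed-length chunking, brute-force cosine search, and random features are illustrative stand-ins for VADER's adaptive chunking, scalable search, and transformer alignment:

```python
# Coarse matching of a query fragment to candidate videos: pool per-frame descriptors
# into chunk descriptors and rank chunks by cosine similarity.
import torch
import torch.nn.functional as F

def chunk_descriptors(frame_feats: torch.Tensor, chunk_len: int = 16) -> torch.Tensor:
    """frame_feats: (num_frames, dim) -> (num_chunks, dim) via mean pooling."""
    n = (frame_feats.shape[0] // chunk_len) * chunk_len
    chunks = frame_feats[:n].view(-1, chunk_len, frame_feats.shape[1])
    return F.normalize(chunks.mean(dim=1), dim=-1)

def coarse_match(query_feats: torch.Tensor, candidates: list) -> tuple:
    """Return (best_video_index, best_chunk_index) for the query fragment."""
    q = F.normalize(query_feats.mean(dim=0), dim=-1)          # one descriptor per query
    best = (-1.0, -1, -1)
    for vid_idx, feats in enumerate(candidates):
        sims = chunk_descriptors(feats) @ q                    # cosine similarities
        score, chunk_idx = sims.max(dim=0)
        if score.item() > best[0]:
            best = (score.item(), vid_idx, chunk_idx.item())
    return best[1], best[2]

# Toy usage with random descriptors standing in for a real visual descriptor network.
query = torch.randn(32, 128)
library = [torch.randn(300, 128) for _ in range(5)]
print(coarse_match(query, library))
```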

RepMix: Representation Mixing for Robust Attribution of Synthesized Images

Jul 12, 2022
Tu Bui, Ning Yu, John Collomosse

Rapid advances in Generative Adversarial Networks (GANs) raise new challenges for image attribution: detecting whether an image is synthetic and, if so, determining which GAN architecture created it. Uniquely, we present a solution to this task that is 1) invariant to the semantic content of the images being matched and 2) robust to benign transformations (changes in quality, resolution, shape, etc.) commonly encountered as images are re-shared online. To formalize the task, we collect Attribution88, a challenging benchmark for robust and practical image attribution. We then propose RepMix, a GAN fingerprinting technique based on representation mixing and a novel loss. We validate its ability to trace the provenance of GAN-generated images invariant to the semantic content of the image while remaining robust to perturbations. We show that our approach improves significantly over existing GAN fingerprinting works in both semantic generalization and robustness. Data and code are available at https://github.com/TuBui/image_attribution.

* Accepted at ECCV 2022; fix typo, add supmat 
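
A minimal sketch of mixing at the representation level rather than the pixel level, assuming PyTorch; the toy backbone, mixing point, and loss weighting are illustrative and do not reproduce the paper's architecture or its novel loss:

```python
# Mixup applied to intermediate representations with a mixed cross-entropy loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

def repmix_step(backbone: nn.Module, head: nn.Module,
                x: torch.Tensor, y: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    """One training step with feature-level mixup."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    feats = backbone(x)                               # (B, D) representations
    mixed = lam * feats + (1.0 - lam) * feats[perm]   # mix in representation space
    logits = head(mixed)
    return lam * F.cross_entropy(logits, y) + (1.0 - lam) * F.cross_entropy(logits, y[perm])

# Toy usage with stand-ins for a real feature extractor and attribution classifier.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 128), nn.ReLU())
head = nn.Linear(128, 12)                             # hypothetical number of GAN/real classes
x, y = torch.rand(8, 3, 64, 64), torch.randint(0, 12, (8,))
print(repmix_step(backbone, head, x, y).item())
```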

SImProv: Scalable Image Provenance Framework for Robust Content Attribution

Jun 28, 2022
Alexander Black, Tu Bui, Simon Jenni, Zhifei Zhang, Viswanathan Swaminathan, John Collomosse

We present SImProv, a scalable image provenance framework that matches a query image back to a trusted database of originals and identifies possible manipulations in the query. SImProv consists of three stages: a scalable search stage that retrieves the top-k most similar images; a re-ranking and near-duplicate detection stage that identifies the original among the candidates; and a manipulation detection and visualization stage that localizes regions within the query that may have been manipulated to differ from the original. SImProv is robust to benign image transformations that commonly occur during online redistribution, such as artifacts due to noise and recompression degradation, as well as to out-of-place transformations due to image padding, warping, and changes in size and shape. Robustness to out-of-place transformations is achieved via end-to-end training of a differentiable warping module within the comparator architecture. We demonstrate effective retrieval and manipulation detection over a dataset of 100 million images.

* Submitted to IEEE Transactions on Information Forensics and Security 
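
A minimal sketch of the three-stage control flow (search, re-rank, compare), assuming PyTorch; the brute-force cosine search and the placeholder re-rank and comparator stages are illustrative stand-ins, not the scalable system or the differentiable warping module:

```python
# Stage 1: top-k descriptor search; Stage 2: re-rank to pick the original;
# Stage 3: pairwise comparison to localise manipulation (placeholder).
import torch
import torch.nn.functional as F

def top_k_search(query_desc, db_desc, k: int = 100):
    """Stage 1: indices of the k most similar database descriptors."""
    sims = F.normalize(db_desc, dim=-1) @ F.normalize(query_desc, dim=-1)
    return sims.topk(k).indices

def rerank(query_desc, db_desc, candidates):
    """Stage 2 (placeholder): pick the single best candidate among the shortlist."""
    sims = F.normalize(db_desc[candidates], dim=-1) @ F.normalize(query_desc, dim=-1)
    return candidates[sims.argmax()].item()

def manipulation_heatmap(query_img, original_img):
    """Stage 3 (placeholder): a learned comparator would go here; a per-pixel
    difference map is used as a crude stand-in."""
    return (query_img - original_img).abs().mean(dim=0)

# Toy usage over a random "database" of 10k descriptors; the query is a slightly
# perturbed copy of entry 1234, so the pipeline should match it back.
db = torch.randn(10_000, 256)
query = db[1234] + 0.05 * torch.randn(256)
print("matched original:", rerank(query, db, top_k_search(query, db, k=100)))
```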

CoGS: Controllable Generation and Search from Sketch and Style

Mar 17, 2022
Cusuh Ham, Gemma Canet Tarres, Tu Bui, James Hays, Zhe Lin, John Collomosse

We present CoGS, a novel method for style-conditioned, sketch-driven synthesis of images. CoGS enables exploration of diverse appearance possibilities for a given sketched object, providing decoupled control over the structure and the appearance of the output. Coarse-grained control over object structure and appearance is provided by an input sketch and an exemplar "style" image, which are fed to a transformer-based sketch-and-style encoder to generate a discrete codebook representation. We map the codebook representation into a metric space, enabling fine-grained control over the selection of, and interpolation between, multiple synthesis options before generating the image via a vector-quantized GAN (VQGAN) decoder. Our framework thereby unifies search and synthesis: a sketch-and-style pair may be used to run an initial synthesis, which may be refined by combining it with similar results in a search corpus to produce an image more closely matching the user's intent. We show that our model, trained on the 125 object classes of our newly created Pseudosketches dataset, is capable of producing a diverse gamut of semantic content and appearance styles.
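
A very high-level sketch of the generation path described above (discrete codebook representation, mapping into a metric space, interpolation, then decoding); all module names, sizes, and the decoder stand-in are illustrative assumptions:

```python
# Embed codebook indices in a continuous metric space, interpolate, then decode.
import torch
import torch.nn as nn

class CodebookMapper(nn.Module):
    """Embed codebook indices into a continuous metric space for interpolation."""
    def __init__(self, vocab: int = 1024, dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, indices: torch.Tensor) -> torch.Tensor:   # (B, L) -> (B, L, dim)
        return self.proj(self.embed(indices))

def blend_and_decode(mapper, decoder, idx_a, idx_b, t: float = 0.5):
    """Interpolate two synthesis options in the metric space, then decode."""
    z = (1.0 - t) * mapper(idx_a) + t * mapper(idx_b)
    return decoder(z)                     # a VQGAN-style decoder is assumed here

# Toy usage: random codebook indices and a dummy decoder standing in for VQGAN.
mapper = CodebookMapper()
dummy_decoder = lambda z: z.mean(dim=-1)  # placeholder; a real decoder maps tokens to an image
idx_a = torch.randint(0, 1024, (1, 256))
idx_b = torch.randint(0, 1024, (1, 256))
print(blend_and_decode(mapper, dummy_decoder, idx_a, idx_b, t=0.3).shape)
```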

ARIA: Adversarially Robust Image Attribution for Content Provenance

Feb 25, 2022
Maksym Andriushchenko, Xiaoyang Rebecca Li, Geoffrey Oxholm, Thomas Gittings, Tu Bui, Nicolas Flammarion, John Collomosse

Image attribution -- matching an image back to a trusted source -- is an emerging tool in the fight against online misinformation. Deep visual fingerprinting models have recently been explored for this purpose. However, they are not robust to tiny input perturbations known as adversarial examples. We first illustrate how to generate valid adversarial images that can easily cause incorrect image attribution. We then describe an approach to prevent imperceptible adversarial attacks on deep visual fingerprinting models via robust contrastive learning. The proposed training procedure leverages training on $\ell_\infty$-bounded adversarial examples; it is conceptually simple and incurs only a small computational overhead. The resulting models are substantially more robust, remain accurate on unperturbed images, and perform well even over a database with millions of images. In particular, we achieve 91.6% standard and 85.1% adversarial recall under $\ell_\infty$-bounded perturbations on manipulated images, compared to 80.1% and 0.0% from prior work. We also show that robustness generalizes to other types of imperceptible perturbations unseen during training. Finally, we show how to train an adversarially robust image comparator model for detecting editorial changes in matched images.
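
A minimal sketch of contrastive training on $\ell_\infty$-bounded adversarial examples, assuming PyTorch and a PGD-style inner attack; the contrastive loss, epsilon, step size, and iteration count are illustrative, not the paper's exact recipe:

```python
# Robust contrastive training: maximise the loss with an l_inf-bounded PGD attack,
# then minimise it on the adversarial view.
import torch
import torch.nn.functional as F

def contrastive_loss(z1, z2, tau: float = 0.2):
    """Matching rows of z1 and z2 are positives; everything else is a negative."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / tau
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)

def pgd_attack(model, x, x_ref, eps=8 / 255, step=2 / 255, iters=3):
    """Find an l_inf-bounded perturbation of x that maximises the contrastive loss."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(iters):
        contrastive_loss(model(x + delta), model(x_ref)).backward()
        with torch.no_grad():
            delta += step * delta.grad.sign()       # ascend the loss
            delta.clamp_(-eps, eps)                 # stay within the l_inf ball
            delta.add_(x).clamp_(0, 1).sub_(x)      # keep x + delta a valid image
        delta.grad.zero_()
    return (x + delta).detach()

def robust_contrastive_step(model, optimizer, x, x_aug):
    """One training step: attack first, then minimise the loss on the adversarial view."""
    x_adv = pgd_attack(model, x, x_aug)
    loss = contrastive_loss(model(x_adv), model(x_aug))
    optimizer.zero_grad()                           # clears gradients left by the attack
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with a small embedding model and lightly perturbed "augmented" views.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 64))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x = torch.rand(8, 3, 32, 32)
print(robust_contrastive_step(model, opt, x, (x + 0.02 * torch.randn_like(x)).clamp(0, 1)))
```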
