Get our free extension to see links to code for papers anywhere online!

Chrome logo  Add to Chrome

Firefox logo Add to Firefox

"Image To Image Translation": models, code, and papers

Membership Privacy Protection for Image Translation Models via Adversarial Knowledge Distillation

Mar 10, 2022
Saeed Ranjbar Alvar, Lanjun Wang, Jian Pei, Yong Zhang

Image-to-image translation models are shown to be vulnerable to the Membership Inference Attack (MIA), in which the adversary's goal is to identify whether a sample is used to train the model or not. With daily increasing applications based on image-to-image translation models, it is crucial to protect the privacy of these models against MIAs. We propose adversarial knowledge distillation (AKD) as a defense method against MIAs for image-to-image translation models. The proposed method protects the privacy of the training samples by improving the generalizability of the model. We conduct experiments on the image-to-image translation models and show that AKD achieves the state-of-the-art utility-privacy tradeoff by reducing the attack performance up to 38.9% compared with the regular training model at the cost of a slight drop in the quality of the generated output images. The experimental results also indicate that the models trained by AKD generalize better than the regular training models. Furthermore, compared with existing defense methods, the results show that at the same privacy protection level, image translation models trained by AKD generate outputs with higher quality; while at the same quality of outputs, AKD enhances the privacy protection over 30%.

Access Paper or Ask Questions

Bridging the gap between paired and unpaired medical image translation

Oct 15, 2021
Pauliina Paavilainen, Saad Ullah Akram, Juho Kannala

Medical image translation has the potential to reduce the imaging workload, by removing the need to capture some sequences, and to reduce the annotation burden for developing machine learning methods. GANs have been used successfully to translate images from one domain to another, such as MR to CT. At present, paired data (registered MR and CT images) or extra supervision (e.g. segmentation masks) is needed to learn good translation models. Registering multiple modalities or annotating structures within each of them is a tedious and laborious task. Thus, there is a need to develop improved translation methods for unpaired data. Here, we introduce modified pix2pix models for tasks CT$\rightarrow$MR and MR$\rightarrow$CT, trained with unpaired CT and MR data, and MRCAT pairs generated from the MR scans. The proposed modifications utilize the paired MR and MRCAT images to ensure good alignment between input and translated images, and unpaired CT images ensure the MR$\rightarrow$CT model produces realistic-looking CT and CT$\rightarrow$MR model works well with real CT as input. The proposed pix2pix variants outperform baseline pix2pix, pix2pixHD and CycleGAN in terms of FID and KID, and generate more realistic looking CT and MR translations.

* Deep Generative Models for MICCAI (DGM4MICCAI) workshop 2021 
Access Paper or Ask Questions

Galaxy Image Translation with Semi-supervised Noise-reconstructed Generative Adversarial Networks

Jan 19, 2021
Qiufan Lin, Dominique Fouchez, Jérôme Pasquet

Image-to-image translation with Deep Learning neural networks, particularly with Generative Adversarial Networks (GANs), is one of the most powerful methods for simulating astronomical images. However, current work is limited to utilizing paired images with supervised translation, and there has been rare discussion on reconstructing noise background that encodes instrumental and observational effects. These limitations might be harmful for subsequent scientific applications in astrophysics. Therefore, we aim to develop methods for using unpaired images and preserving noise characteristics in image translation. In this work, we propose a two-way image translation model using GANs that exploits both paired and unpaired images in a semi-supervised manner, and introduce a noise emulating module that is able to learn and reconstruct noise characterized by high-frequency features. By experimenting on multi-band galaxy images from the Sloan Digital Sky Survey (SDSS) and the Canada France Hawaii Telescope Legacy Survey (CFHT), we show that our method recovers global and local properties effectively and outperforms benchmark image translation models. To our best knowledge, this work is the first attempt to apply semi-supervised methods and noise reconstruction techniques in astrophysical studies.

* Accepted at ICPR 2020 
Access Paper or Ask Questions

Harmonic Unpaired Image-to-image Translation

Feb 26, 2019
Rui Zhang, Tomas Pfister, Jia Li

The recent direction of unpaired image-to-image translation is on one hand very exciting as it alleviates the big burden in obtaining label-intensive pixel-to-pixel supervision, but it is on the other hand not fully satisfactory due to the presence of artifacts and degenerated transformations. In this paper, we take a manifold view of the problem by introducing a smoothness term over the sample graph to attain harmonic functions to enforce consistent mappings during the translation. We develop HarmonicGAN to learn bi-directional translations between the source and the target domains. With the help of similarity-consistency, the inherent self-consistency property of samples can be maintained. Distance metrics defined on two types of features including histogram and CNN are exploited. Under an identical problem setting as CycleGAN, without additional manual inputs and only at a small training-time cost, HarmonicGAN demonstrates a significant qualitative and quantitative improvement over the state of the art, as well as improved interpretability. We show experimental results in a number of applications including medical imaging, object transfiguration, and semantic labeling. We outperform the competing methods in all tasks, and for a medical imaging task in particular our method turns CycleGAN from a failure to a success, halving the mean-squared error, and generating images that radiologists prefer over competing methods in 95% of cases.

Access Paper or Ask Questions

High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network

May 19, 2021
Jie Liang, Hui Zeng, Lei Zhang

Existing image-to-image translation (I2IT) methods are either constrained to low-resolution images or long inference time due to their heavy computational burden on the convolution of high-resolution feature maps. In this paper, we focus on speeding-up the high-resolution photorealistic I2IT tasks based on closed-form Laplacian pyramid decomposition and reconstruction. Specifically, we reveal that the attribute transformations, such as illumination and color manipulation, relate more to the low-frequency component, while the content details can be adaptively refined on high-frequency components. We consequently propose a Laplacian Pyramid Translation Network (LPTN) to simultaneously perform these two tasks, where we design a lightweight network for translating the low-frequency component with reduced resolution and a progressive masking strategy to efficiently refine the high-frequency ones. Our model avoids most of the heavy computation consumed by processing high-resolution feature maps and faithfully preserves the image details. Extensive experimental results on various tasks demonstrate that the proposed method can translate 4K images in real-time using one normal GPU while achieving comparable transformation performance against existing methods. Datasets and codes are available:

* To appear at CVPR 2021 
Access Paper or Ask Questions

EDIT: Exemplar-Domain Aware Image-to-Image Translation

Nov 24, 2019
Yuanbin Fu, Jiayi Ma, Lin Ma, Xiaojie Guo

Image-to-image translation is to convert an image of the certain style to another of the target style with the content preserved. A desired translator should be capable to generate diverse results in a controllable (many-to-many) fashion. To this end, we design a novel generative adversarial network, namely exemplar-domain aware image-to-image translator (EDIT for short). The principle behind is that, for images from multiple domains, the content features can be obtained by a uniform extractor, while (re-)stylization is achieved by mapping the extracted features specifically to different purposes (domains and exemplars). The generator of our EDIT comprises of a part of blocks configured by shared parameters, and the rest by varied parameters exported by an exemplar-domain aware parameter network. In addition, a discriminator is equipped during the training phase to guarantee the output satisfying the distribution of the target domain. Our EDIT can flexibly and effectively work on multiple domains and arbitrary exemplars in a unified neat model. We conduct experiments to show the efficacy of our design, and reveal its advances over other state-of-the-art methods both quantitatively and qualitatively.

* 9 pages, 7 figures 
Access Paper or Ask Questions

Translating Images into Maps

Oct 03, 2021
Avishkar Saha, Oscar Mendez Maldonado, Chris Russell, Richard Bowden

We approach instantaneous mapping, converting images to a top-down view of the world, as a translation problem. We show how a novel form of transformer network can be used to map from images and video directly to an overhead map or bird's-eye-view (BEV) of the world, in a single end-to-end network. We assume a 1-1 correspondence between a vertical scanline in the image, and rays passing through the camera location in an overhead map. This lets us formulate map generation from an image as a set of sequence-to-sequence translations. Posing the problem as translation allows the network to use the context of the image when interpreting the role of each pixel. This constrained formulation, based upon a strong physical grounding of the problem, leads to a restricted transformer network that is convolutional in the horizontal direction only. The structure allows us to make efficient use of data when training, and obtains state-of-the-art results for instantaneous mapping of three large-scale datasets, including a 15% and 30% relative gain against existing best performing methods on the nuScenes and Argoverse datasets, respectively. We make our code available on

* 7 pages 
Access Paper or Ask Questions

Improving Endoscopic Decision Support Systems by Translating Between Imaging Modalities

Apr 27, 2020
Georg Wimmer, Michael Gadermayr, Andreas Vécsei, Andreas Uhl

Novel imaging technologies raise many questions concerning the adaptation of computer-aided decision support systems. Classification models either need to be adapted or even newly trained from scratch to exploit the full potential of enhanced techniques. Both options typically require the acquisition of new labeled training data. In this work we investigate the applicability of image-to-image translation to endoscopic images showing different imaging modalities, namely conventional white-light and narrow-band imaging. In a study on computer-aided celiac disease diagnosis, we explore whether image-to-image translation is capable of effectively performing the translation between the domains. We investigate if models can be trained on virtual (or a mixture of virtual and real) samples to improve overall accuracy in a setting with limited labeled training data. Finally, we also ask whether a translation of testing images to another domain is capable of improving accuracy by exploiting the enhanced imaging characteristics.

* Submitted to MICCAI 2020 
Access Paper or Ask Questions

Exploring Patch-wise Semantic Relation for Contrastive Learning in Image-to-Image Translation Tasks

Mar 03, 2022
Chanyong Jung, Gihyun Kwon, Jong Chul Ye

Recently, contrastive learning-based image translation methods have been proposed, which contrasts different spatial locations to enhance the spatial correspondence. However, the methods often ignore the diverse semantic relation within the images. To address this, here we propose a novel semantic relation consistency (SRC) regularization along with the decoupled contrastive learning, which utilize the diverse semantics by focusing on the heterogeneous semantics between the image patches of a single image. To further improve the performance, we present a hard negative mining by exploiting the semantic relation. We verified our method for three tasks: single-modal and multi-modal image translations, and GAN compression task for image translation. Experimental results confirmed the state-of-art performance of our method in all the three tasks.

* CVPR 2022 
Access Paper or Ask Questions

Cross-Domain Medical Image Translation by Shared Latent Gaussian Mixture Model

Jul 14, 2020
Yingying Zhu, Youbao Tang, Yuxing Tang, Daniel C. Elton, Sungwon Lee, Perry J. Pickhardt, Ronald M. Summers

Current deep learning based segmentation models often generalize poorly between domains due to insufficient training data. In real-world clinical applications, cross-domain image analysis tools are in high demand since medical images from different domains are often needed to achieve a precise diagnosis. An important example in radiology is generalizing from non-contrast CT to contrast enhanced CTs. Contrast enhanced CT scans at different phases are used to enhance certain pathologies or organs. Many existing cross-domain image-to-image translation models have been shown to improve cross-domain segmentation of large organs. However, such models lack the ability to preserve fine structures during the translation process, which is significant for many clinical applications, such as segmenting small calcified plaques in the aorta and pelvic arteries. In order to preserve fine structures during medical image translation, we propose a patch-based model using shared latent variables from a Gaussian mixture model. We compare our image translation framework to several state-of-the-art methods on cross-domain image translation and show our model does a better job preserving fine structures. The superior performance of our model is verified by performing two tasks with the translated images - detection and segmentation of aortic plaques and pancreas segmentation. We expect the utility of our framework will extend to other problems beyond segmentation due to the improved quality of the generated images and enhanced ability to preserve small structures.

* Accepted to Medical Image Computing and Computer Assisted Intervention (MICCAI) 2020 
Access Paper or Ask Questions