"Image To Image Translation": models, code, and papers

Multi-Texture GAN: Exploring the Multi-Scale Texture Translation for Brain MR Images

Feb 14, 2021
Xiaobin Hu

Inter-scanner and inter-protocol discrepancies in MRI datasets are known to lead to significant quantification variability. Hence image-to-image or scanner-to-scanner translation is a crucial frontier in medical image analysis with many potential applications. Nonetheless, a significant share of existing algorithms cannot explicitly exploit and preserve texture details from target scanners, and instead offer individual solutions built on specialized, task-specific architectures. In this paper, we design a multi-scale texture transfer to enrich the reconstructed images with more details. Specifically, after calculating textural similarity, the multi-scale texture transfer adaptively moves texture information from target or reference images to the restored images. Unlike the pixel-wise matching used by previous algorithms, we match texture features in a multi-scale scheme implemented in the neural feature space. This matching mechanism exploits multi-scale neural transfer and encourages the model to capture more semantics-related and lesion-related priors from the target or reference images. We evaluate our multi-scale texture GAN on three different tasks without any task-specific modifications: cross-protocol super-resolution of diffusion MRI, T1-Flair, and Flair-T2 modality translation. Our multi-texture GAN recovers more high-resolution structure (i.e., edges and anatomy), texture (i.e., contrast and pixel intensities), and lesion information (i.e., tumor). Extensive quantitative and qualitative experiments demonstrate that our method achieves superior results in inter-protocol and inter-scanner translation over state-of-the-art methods.
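
A minimal sketch of the multi-scale texture matching idea described above, assuming PyTorch and an arbitrary feature pyramid (e.g. VGG activations at several scales). The function names and the patch-centre transfer are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch: patch-wise texture matching in neural feature space,
# applied independently at each scale of a feature pyramid.
import torch
import torch.nn.functional as F

def transfer_texture(restored_feat, reference_feat, patch=3):
    """Replace each location of `restored_feat` with the centre of its most
    similar (normalized cross-correlation) patch from `reference_feat`.
    Both tensors: (1, C, H, W) feature maps at one scale."""
    # Unfold the reference into patches and use them as correlation filters.
    ref_patches = F.unfold(reference_feat, kernel_size=patch, padding=patch // 2)
    n = ref_patches.shape[-1]                                  # number of patches
    ref_patches = ref_patches.transpose(1, 2).reshape(n, -1, patch, patch)
    norms = ref_patches.flatten(1).norm(dim=1).view(-1, 1, 1, 1)
    ref_norm = ref_patches / (norms + 1e-8)
    # Correlate every restored location with every reference patch.
    scores = F.conv2d(restored_feat, ref_norm, padding=patch // 2)  # (1, n, H, W)
    best = scores.argmax(dim=1)                                     # (1, H, W)
    # Rebuild the feature map from the best-matching reference patch centres.
    centres = ref_patches[:, :, patch // 2, patch // 2]             # (n, C)
    return centres[best.view(-1)].t().reshape(restored_feat.shape)

def multi_scale_transfer(restored_feats, reference_feats):
    """Apply the matching at each scale of the pyramid."""
    return [transfer_texture(r, t) for r, t in zip(restored_feats, reference_feats)]
```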

  

Efficient Rotation-Scaling-Translation Parameters Estimation Based on Fractal Image Model

Jul 04, 2015
M. Uss, B. Vozel, V. Lukin, K. Chehdi

This paper deals with area-based subpixel image registration under a rotation-isometric scaling-translation transformation hypothesis. Our approach is based on parametrical modeling of geometrically transformed textural image fragments and maximum likelihood estimation of the transformation vector between them. Thanks to the parametrical approach based on fractional Brownian motion modeling of the local fragment texture, the proposed estimator MLfBm (ML stands for "Maximum Likelihood" and fBm for "fractional Brownian motion") adapts better to real image texture content than other methods relying on universal similarity measures such as mutual information or normalized correlation. The main benefits are observed when the assumptions underlying the fBm model are fully satisfied, e.g. for isotropic normally distributed textures with stationary increments. Experiments on both simulated and real images, and for both high and weak correlation between registered images, show that the MLfBm estimator offers significant improvement over other state-of-the-art methods. It reduces translation vector, rotation angle, and scaling factor estimation errors by a factor of about 1.75...2 and decreases the probability of a false match by up to 5 times. Besides, an accurate confidence interval for MLfBm estimates can be obtained from the Cramer-Rao lower bound on rotation-scaling-translation parameter estimation error. This bound depends on texture roughness, the noise level in the reference and template images, the correlation between these images, and the geometrical transformation parameters.
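
A hedged sketch of the overall area-based estimation loop: warp a template fragment under candidate rotation-scaling-translation parameters and optimize a similarity objective. Normalized correlation stands in here as a placeholder objective; the paper's MLfBm estimator instead maximizes an fBm-based log-likelihood, which is not reproduced. Function names and the SciPy-based setup are assumptions.

```python
import numpy as np
from scipy import ndimage, optimize

def warp(template, angle, scale, tx, ty):
    """Rotate/scale about the fragment centre, then translate by (tx, ty).
    affine_transform expects the inverse (output -> input) mapping."""
    c, s = np.cos(angle), np.sin(angle)
    matrix = np.array([[c, s], [-s, c]]) / scale        # inverse rotation / scale
    centre = (np.array(template.shape) - 1) / 2.0
    shift = np.array([ty, tx])                          # (row, col) translation
    offset = centre - matrix @ (centre + shift)
    return ndimage.affine_transform(template, matrix, offset=offset, order=1)

def neg_similarity(theta, reference, template):
    angle, scale, tx, ty = theta
    warped = warp(template, angle, scale, tx, ty)
    a = reference - reference.mean()
    b = warped - warped.mean()
    ncc = (a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return -ncc   # placeholder; MLfBm would return the negative fBm log-likelihood

def estimate_rst(reference, template, theta0=(0.0, 1.0, 0.0, 0.0)):
    """Estimate (rotation angle, scaling factor, tx, ty) between two fragments."""
    result = optimize.minimize(neg_similarity, theta0,
                               args=(reference, template), method="Nelder-Mead")
    return result.x
```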

* 42 pages, 8 figures, 7 tables. Journal paper 
  

AttentionGAN: Unpaired Image-to-Image Translation using Attention-Guided Generative Adversarial Networks

Dec 28, 2019
Hao Tang, Hong Liu, Dan Xu, Philip H. S. Torr, Nicu Sebe

State-of-the-art methods for unpaired image-to-image translation are capable of learning a mapping from a source domain to a target domain with unpaired image data. Though the existing methods have achieved promising results, they still produce undesirable artifacts: they can convert low-level information but remain limited in transforming the high-level semantics of input images. One possible reason is that generators do not have the ability to perceive the most discriminative semantic parts between the source and target domains, which results in low-quality generated images. In this paper, we propose new Attention-Guided Generative Adversarial Networks (AttentionGAN) for the unpaired image-to-image translation task. AttentionGAN can identify the most discriminative semantic objects and minimize changes to unwanted parts for semantic manipulation problems without using extra data or models. The attention-guided generators in AttentionGAN produce attention masks via a built-in attention mechanism, and then fuse the generation output with the attention masks to obtain high-quality target images. Accordingly, we also design a novel attention-guided discriminator which only considers attended regions. Extensive experiments on several generative tasks demonstrate that the proposed model generates sharper and more realistic images than existing competitive models. The source code for the proposed AttentionGAN is available at https://github.com/Ha0Tang/AttentionGAN.
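
A minimal sketch of the attention-guided fusion described above, assuming PyTorch. The backbone, layer sizes, and the single foreground mask are illustrative simplifications, not the paper's architecture.

```python
import torch
import torch.nn as nn

class AttentionGuidedGenerator(nn.Module):
    def __init__(self, channels=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.to_content = nn.Conv2d(channels, 3, 3, padding=1)    # translated content
        self.to_attention = nn.Conv2d(channels, 1, 3, padding=1)  # foreground mask

    def forward(self, x):
        h = self.backbone(x)
        content = torch.tanh(self.to_content(h))
        mask = torch.sigmoid(self.to_attention(h))        # values in [0, 1]
        # Attended regions take the generated content; the rest keeps the input.
        fused = mask * content + (1 - mask) * x
        return fused, mask
```

An attention-guided discriminator can then score only the attended regions, e.g. by multiplying both real and fake images by the mask before the real/fake head.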

* An extended version of a paper published in IJCNN2019. arXiv admin note: substantial text overlap with arXiv:1903.12296. Add more results 
  

Palette: Image-to-Image Diffusion Models

Nov 10, 2021
Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi

We introduce Palette, a simple and general framework for image-to-image translation using conditional diffusion models. On four challenging image-to-image translation tasks (colorization, inpainting, uncropping, and JPEG decompression), Palette outperforms strong GAN and regression baselines, and establishes a new state of the art. This is accomplished without task-specific hyper-parameter tuning, architecture customization, or any auxiliary loss, demonstrating a desirable degree of generality and flexibility. We uncover the impact of using $L_2$ vs. $L_1$ loss in the denoising diffusion objective on sample diversity, and demonstrate the importance of self-attention through empirical architecture studies. Importantly, we advocate a unified evaluation protocol based on ImageNet, and report several sample quality scores including FID, Inception Score, Classification Accuracy of a pre-trained ResNet-50, and Perceptual Distance against reference images for various baselines. We expect this standardized evaluation protocol to play a critical role in advancing image-to-image translation research. Finally, we show that a single generalist Palette model trained on 3 tasks (colorization, inpainting, JPEG decompression) performs as well or better than task-specific specialist counterparts.
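
A hedged sketch of the conditional denoising-diffusion training objective that the L1-vs-L2 comparison above refers to: the network is conditioned on the source image, sees the noised target, and predicts the injected noise. The names `model` and `alphas_cumprod` and the argument order are assumptions, not Palette's actual interface.

```python
import torch
import torch.nn.functional as F

def diffusion_loss(model, source, target, alphas_cumprod, use_l1=True):
    """One training step of a conditional denoising diffusion model.
    source, target: (B, C, H, W); alphas_cumprod: (T,) noise schedule."""
    b = target.shape[0]
    t = torch.randint(0, len(alphas_cumprod), (b,), device=target.device)
    a_bar = alphas_cumprod[t].view(b, 1, 1, 1)
    noise = torch.randn_like(target)
    noisy = a_bar.sqrt() * target + (1 - a_bar).sqrt() * noise   # forward process
    pred = model(source, noisy, t)                               # predict the noise
    return F.l1_loss(pred, noise) if use_l1 else F.mse_loss(pred, noise)
```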

  

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

Mar 28, 2020
Runfa Chen, Wenbing Huang, Binghui Huang, Fuchun Sun, Bin Fang

Unsupervised image-to-image translation is a central task in computer vision. Current translation frameworks discard the discriminator once training is completed. This paper argues for a novel role of the discriminator by reusing it to encode the images of the target domain. The proposed architecture, termed NICE-GAN, exhibits two advantages over previous approaches: first, it is more compact, since no independent encoding component is required; second, the plug-in encoder is directly trained by the adversarial loss, making it more informative and more effectively trained when a multi-scale discriminator is applied. The main issue in NICE-GAN is the coupling of translation with discrimination along the encoder, which could incur training inconsistency when playing the min-max game of the GAN. To tackle this issue, we develop a decoupled training strategy in which the encoder is only trained when maximizing the adversarial loss and is kept frozen otherwise. Extensive experiments on four popular benchmarks demonstrate the superior performance of NICE-GAN over state-of-the-art methods in terms of FID, KID, and human preference. Comprehensive ablation studies are also carried out to validate each proposed component. Our code is available at https://github.com/alpc91/NICE-GAN-pytorch.
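
A hypothetical sketch of the two ideas above: the discriminator's early layers double as the encoder, and under the decoupled schedule the encoder is frozen on the generator step, receiving gradients only when the adversarial loss is maximized. Module sizes and names are illustrative; the released implementation is at the repository linked in the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Discriminator(nn.Module):
    def __init__(self, ch=64):
        super().__init__()
        # Early layers double as the encoder of the domain's images.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, ch, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1), nn.LeakyReLU(0.2))
        self.classifier = nn.Conv2d(ch * 2, 1, 4, padding=1)  # real/fake head

    def encode(self, x):
        return self.encoder(x)

    def forward(self, x):
        return self.classifier(self.encoder(x))

def generator_step(disc, decoder, images, opt_dec):
    """Decoupled schedule: the shared encoder is frozen here; it is only
    updated on the discriminator step, where the adversarial loss is maximized."""
    with torch.no_grad():
        latent = disc.encode(images)          # reused encoder, no gradients here
    fake = decoder(latent)
    score = disc(fake)
    loss = F.binary_cross_entropy_with_logits(score, torch.ones_like(score))
    opt_dec.zero_grad()
    loss.backward()                           # updates the decoder only
    opt_dec.step()
    return fake
```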

* Accepted to CVPR 2020 
  
