Yunjey Choi

StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

Apr 30, 2021

Hyunsu Kim, Yunjey Choi, Junho Kim, Sungjoo Yoo, Youngjung Uh

Abstract: Generative adversarial networks (GANs) synthesize realistic images from random latent vectors. Although manipulating the latent vectors controls the synthesized outputs, editing real images with GANs suffers from either i) time-consuming optimization for projecting real images to the latent vectors or ii) inaccurate embedding through an encoder. We propose StyleMapGAN: the intermediate latent space has spatial dimensions, and a spatially variant modulation replaces AdaIN. It makes the embedding through an encoder more accurate than existing optimization-based methods while maintaining the properties of GANs. Experimental results demonstrate that our method significantly outperforms state-of-the-art models in various image manipulation tasks such as local editing and image interpolation. Last but not least, conventional editing methods on GANs are still valid on our StyleMapGAN. Source code is available at https://github.com/naver-ai/StyleMapGAN.

* Accepted to CVPR 2021

Rethinking the Truly Unsupervised Image-to-Image Translation

Jun 11, 2020

Kyungjune Baek, Yunjey Choi, Youngjung Uh, Jaejun Yoo, Hyunjung Shim

Abstract: Every recent image-to-image translation model uses either image-level (i.e., input-output pairs) or set-level (i.e., domain labels) supervision at minimum. However, even the set-level supervision can be a serious bottleneck for data collection in practice. In this paper, we tackle image-to-image translation in a fully unsupervised setting, i.e., with neither paired images nor domain labels. To this end, we propose the truly unsupervised image-to-image translation method (TUNIT) that simultaneously learns to separate image domains via an information-theoretic approach and to generate corresponding images using the estimated domain labels. Experimental results on various datasets show that the proposed method successfully separates domains and translates images across those domains. In addition, our model outperforms existing set-level supervised methods under a semi-supervised setting, where a subset of domain labels is provided. The source code is available at https://github.com/clovaai/tunit.

Reliable Fidelity and Diversity Metrics for Generative Models

Feb 23, 2020

Muhammad Ferjad Naeem, Seong Joon Oh, Youngjung Uh, Yunjey Choi, Jaejun Yoo

Abstract: Devising indicative evaluation metrics for the image generation task remains an open problem. The most widely used metric for measuring the similarity between real and generated images has been the Fréchet Inception Distance (FID) score. Because it does not differentiate the fidelity and diversity aspects of the generated images, recent papers have introduced variants of precision and recall metrics to diagnose those properties separately. In this paper, we show that even the latest version of the precision and recall metrics is not yet reliable. For example, they fail to detect the match between two identical distributions, they are not robust against outliers, and the evaluation hyperparameters are selected arbitrarily. We propose density and coverage metrics that solve the above issues. We analytically and experimentally show that density and coverage provide more interpretable and reliable signals for practitioners than the existing metrics. Code: https://github.com/clovaai/generative-evaluation-prdc.

* First two authors have contributed equally
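
The abstract above names density and coverage without giving formulas. As a rough illustration only, the sketch below assumes the k-nearest-neighbour formulation: each real sample gets a ball whose radius is the distance to its k-th nearest real neighbour; density counts how many such balls contain each generated sample (normalized by k), and coverage is the fraction of real samples whose ball contains at least one generated sample. The function names, the use of raw coordinates instead of learned embeddings, and the pure-Python distance computation are assumptions made for this example, not the authors' released code.

```python
import math

def knn_radii(points, k):
    """Distance from each point to its k-th nearest neighbour in the same set."""
    radii = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        radii.append(dists[k - 1])
    return radii

def density_and_coverage(real, fake, k):
    """Return (density, coverage) under the assumed k-NN ball definitions."""
    radii = knn_radii(real, k)
    # inside[i][j] is True when fake sample j lies strictly inside
    # the k-NN ball centred on real sample i.
    inside = [[math.dist(r, f) < rad for f in fake] for r, rad in zip(real, radii)]
    density = sum(sum(row) for row in inside) / (k * len(fake))
    coverage = sum(any(row) for row in inside) / len(real)
    return density, coverage

real = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
far_away = [(10.0, 10.0)] * 4
print(density_and_coverage(real, real, k=1))
print(density_and_coverage(real, far_away, k=1))
```

In this toy case, comparing the real set with itself gives the highest coverage, while a generated set far from the real samples scores zero on both metrics. In practice the samples would be feature embeddings rather than 2-D points.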

StarGAN v2: Diverse Image Synthesis for Multiple Domains

Dec 04, 2019

Yunjey Choi, Youngjung Uh, Jaejun Yoo, Jung-Woo Ha

Abstract: A good image-to-image translation model should learn a mapping between different visual domains while satisfying the following properties: 1) diversity of generated images and 2) scalability over multiple domains. Existing methods address only one of these issues, having either limited diversity or multiple models for all domains. We propose StarGAN v2, a single framework that tackles both and shows significantly improved results over the baselines. Experiments on CelebA-HQ and a new animal faces dataset (AFHQ) validate our superiority in terms of visual quality, diversity, and scalability. To better assess image-to-image translation models, we release AFHQ, high-quality animal faces with large inter- and intra-domain differences. The code, pretrained models, and dataset can be found at https://github.com/clovaai/stargan-v2.

StarGAN: Unified Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

Sep 21, 2018

Yunjey Choi, Minje Choi, Munyoung Kim, Jung-Woo Ha, Sunghun Kim, Jaegul Choo

Abstract: Recent studies have shown remarkable success in image-to-image translation for two domains. However, existing approaches have limited scalability and robustness in handling more than two domains, since different models should be built independently for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model. Such a unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network. This leads to StarGAN's superior quality of translated images compared to existing models as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.

* IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 8789-8797
* Accepted to CVPR 2018 (Oral)
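
The scalability argument in the StarGAN abstract can be made concrete with simple counting. The sketch below assumes that a pairwise approach trains one generator per ordered pair of domains, while a unified model needs a single generator regardless of the number of domains; the function name is hypothetical and chosen only for this illustration.

```python
def pairwise_generators(num_domains: int) -> int:
    """Generators needed when every ordered domain pair gets its own model."""
    return num_domains * (num_domains - 1)

# Compare the assumed pairwise count with the single unified model.
for k in (2, 4, 8):
    print(f"{k} domains: pairwise={pairwise_generators(k)}, unified=1")
```

The pairwise count grows quadratically with the number of domains, which is the bottleneck a single-model design avoids.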
