Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Topic:photo style transfer

Sketch3T: Test-Time Training for Zero-Shot SBIR

Mar 28, 2022

Aneeshan Sain, Ayan Kumar Bhunia, Vaishnav Potlapalli, Pinaki Nath Chowdhury, Tao Xiang, Yi-Zhe Song

Figure 1 for Sketch3T: Test-Time Training for Zero-Shot SBIR

Figure 2 for Sketch3T: Test-Time Training for Zero-Shot SBIR

Figure 3 for Sketch3T: Test-Time Training for Zero-Shot SBIR

Figure 4 for Sketch3T: Test-Time Training for Zero-Shot SBIR

Abstract:Zero-shot sketch-based image retrieval typically asks for a trained model to be applied as is to unseen categories. In this paper, we question to argue that this setup by definition is not compatible with the inherent abstract and subjective nature of sketches, i.e., the model might transfer well to new categories, but will not understand sketches existing in different test-time distribution as a result. We thus extend ZS-SBIR asking it to transfer to both categories and sketch distributions. Our key contribution is a test-time training paradigm that can adapt using just one sketch. Since there is no paired photo, we make use of a sketch raster-vector reconstruction module as a self-supervised auxiliary task. To maintain the fidelity of the trained cross-modal joint embedding during test-time update, we design a novel meta-learning based training paradigm to learn a separation between model updates incurred by this auxiliary task from those off the primary objective of discriminative learning. Extensive experiments show our model to outperform state of-the-arts, thanks to the proposed test-time adaption that not only transfers to new categories but also accommodates to new sketching styles.

* 10 pages, 5 figures. Accepted in CVPR 2022

Via

Access Paper or Ask Questions

Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Generative Adversarial Networks

Apr 26, 2021

Sebastian Szyller, Vasisht Duddu, Tommi Gröndahl, N. Asokan

Figure 1 for Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Generative Adversarial Networks

Figure 2 for Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Generative Adversarial Networks

Figure 3 for Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Generative Adversarial Networks

Figure 4 for Good Artists Copy, Great Artists Steal: Model Extraction Attacks Against Image Translation Generative Adversarial Networks

Abstract:Machine learning models are typically made available to potential client users via inference APIs. Model extraction attacks occur when a malicious client uses information gleaned from queries to the inference API of a victim model $F_V$ to build a surrogate model $F_A$ that has comparable functionality. Recent research has shown successful model extraction attacks against image classification, and NLP models. In this paper, we show the first model extraction attack against real-world generative adversarial network (GAN) image translation models. We present a framework for conducting model extraction attacks against image translation models, and show that the adversary can successfully extract functional surrogate models. The adversary is not required to know $F_V$'s architecture or any other information about it beyond its intended image translation task, and queries $F_V$'s inference interface using data drawn from the same domain as the training data for $F_V$. We evaluate the effectiveness of our attacks using three different instances of two popular categories of image translation: (1) Selfie-to-Anime and (2) Monet-to-Photo (image style transfer), and (3) Super-Resolution (super resolution). Using standard performance metrics for GANs, we show that our attacks are effective in each of the three cases -- the differences between $F_V$ and $F_A$, compared to the target are in the following ranges: Selfie-to-Anime: FID $13.36-68.66$, Monet-to-Photo: FID $3.57-4.40$, and Super-Resolution: SSIM: $0.06-0.08$ and PSNR: $1.43-4.46$. Furthermore, we conducted a large scale (125 participants) user study on Selfie-to-Anime and Monet-to-Photo to show that human perception of the images produced by the victim and surrogate models can be considered equivalent, within an equivalence bound of Cohen's $d=0.3$.

* 9 pages, 7 figures

Via

Access Paper or Ask Questions

Photo style transfer with consistency losses

May 09, 2020

Xu Yao, Gilles Puy, Patrick Pérez

Figure 1 for Photo style transfer with consistency losses

Figure 2 for Photo style transfer with consistency losses

Figure 3 for Photo style transfer with consistency losses

Figure 4 for Photo style transfer with consistency losses

Abstract:We address the problem of style transfer between two photos and propose a new way to preserve photorealism. Using the single pair of photos available as input, we train a pair of deep convolution networks (convnets), each of which transfers the style of one photo to the other. To enforce photorealism, we introduce a content preserving mechanism by combining a cycle-consistency loss with a self-consistency loss. Experimental results show that this method does not suffer from typical artifacts observed in methods working in the same settings. We then further analyze some properties of these trained convnets. First, we notice that they can be used to stylize other unseen images with same known style. Second, we show that retraining only a small subset of the network parameters can be sufficient to adapt these convnets to new styles.

* In 2019 IEEE International Conference on Image Processing (ICIP) (pp. 2314-2318). IEEE

Via

Access Paper or Ask Questions

Data Augmentation using Random Image Cropping for High-resolution Virtual Try-On (VITON-CROP)

Nov 16, 2021

Taewon Kang, Sunghyun Park, Seunghwan Choi, Jaegul Choo

Figure 1 for Data Augmentation using Random Image Cropping for High-resolution Virtual Try-On (VITON-CROP)

Figure 2 for Data Augmentation using Random Image Cropping for High-resolution Virtual Try-On (VITON-CROP)

Figure 3 for Data Augmentation using Random Image Cropping for High-resolution Virtual Try-On (VITON-CROP)

Abstract:Image-based virtual try-on provides the capacity to transfer a clothing item onto a photo of a given person, which is usually accomplished by warping the item to a given human pose and adjusting the warped item to the person. However, the results of real-world synthetic images (e.g., selfies) from the previous method is not realistic because of the limitations which result in the neck being misrepresented and significant changes to the style of the garment. To address these challenges, we propose a novel method to solve this unique issue, called VITON-CROP. VITON-CROP synthesizes images more robustly when integrated with random crop augmentation compared to the existing state-of-the-art virtual try-on models. In the experiments, we demonstrate that VITON-CROP is superior to VITON-HD both qualitatively and quantitatively.

* 4 pages, 3 figures

Via

Access Paper or Ask Questions

Filter Style Transfer between Photos

Jul 15, 2020

Jonghwa Yim, Jisung Yoo, Won-joon Do, Beomsu Kim, Jihwan Choe

Figure 1 for Filter Style Transfer between Photos

Figure 2 for Filter Style Transfer between Photos

Figure 3 for Filter Style Transfer between Photos

Figure 4 for Filter Style Transfer between Photos

Abstract:Over the past few years, image-to-image style transfer has risen to the frontiers of neural image processing. While conventional methods were successful in various tasks such as color and texture transfer between images, none could effectively work with the custom filter effects that are applied by users through various platforms like Instagram. In this paper, we introduce a new concept of style transfer, Filter Style Transfer (FST). Unlike conventional style transfer, new technique FST can extract and transfer custom filter style from a filtered style image to a content image. FST first infers the original image from a filtered reference via image-to-image translation. Then it estimates filter parameters from the difference between them. To resolve the ill-posed nature of reconstructing the original image from the reference, we represent each pixel color of an image to class mean and deviation. Besides, to handle the intra-class color variation, we propose an uncertainty based weighted least square method for restoring an original image. To the best of our knowledge, FST is the first style transfer method that can transfer custom filter effects between FHD image under 2ms on a mobile device without any textual context loss.

* ECCV (Spotlight) 2020

Via

Access Paper or Ask Questions

Spatial Content Alignment For Pose Transfer

Mar 31, 2021

Wing-Yin Yu, Lai-Man Po, Yuzhi Zhao, Jingjing Xiong, Kin-Wai Lau

Figure 1 for Spatial Content Alignment For Pose Transfer

Figure 2 for Spatial Content Alignment For Pose Transfer

Figure 3 for Spatial Content Alignment For Pose Transfer

Figure 4 for Spatial Content Alignment For Pose Transfer

Abstract:Due to unreliable geometric matching and content misalignment, most conventional pose transfer algorithms fail to generate fine-trained person images. In this paper, we propose a novel framework Spatial Content Alignment GAN (SCAGAN) which aims to enhance the content consistency of garment textures and the details of human characteristics. We first alleviate the spatial misalignment by transferring the edge content to the target pose in advance. Secondly, we introduce a new Content-Style DeBlk which can progressively synthesize photo-realistic person images based on the appearance features of the source image, the target pose heatmap and the prior transferred content in edge domain. We compare the proposed framework with several state-of-the-art methods to show its superiority in quantitative and qualitative analysis. Moreover, detailed ablation study results demonstrate the efficacy of our contributions. Codes are publicly available at github.com/rocketappslab/SCA-GAN.

* IEEE International Conference on Multimedia and Expo (ICME) 2021 Oral

Via

Access Paper or Ask Questions

Deep Preset: Blending and Retouching Photos with Color Style Transfer

Jul 21, 2020

Man M. Ho, Jinjia Zhou

Figure 1 for Deep Preset: Blending and Retouching Photos with Color Style Transfer

Figure 2 for Deep Preset: Blending and Retouching Photos with Color Style Transfer

Figure 3 for Deep Preset: Blending and Retouching Photos with Color Style Transfer

Figure 4 for Deep Preset: Blending and Retouching Photos with Color Style Transfer

Abstract:End-users, without knowledge in photography, desire to beautify their photos to have a similar color style as a well-retouched reference. However, recent works in image style transfer are overused. They usually synthesize undesirable results due to transferring exact colors to the wrong destination. It becomes even worse in sensitive cases such as portraits. In this work, we concentrate on learning low-level image transformation, especially color-shifting methods, rather than mixing contextual features, then present a novel scheme to train color style transfer with ground-truth. Furthermore, we propose a color style transfer named Deep Preset. It is designed to 1) generalize the features representing the color transformation from content with natural colors to retouched reference, then blend it into the contextual features of content, 2) predict hyper-parameters (settings or preset) of the applied low-level color transformation methods, 3) stylize content to have a similar color style as reference. We script Lightroom, a powerful tool in editing photos, to generate 600,000 training samples using 1,200 images from the Flick2K dataset and 500 user-generated presets with 69 settings. Experimental results show that our Deep Preset outperforms the previous works in color style transfer quantitatively and qualitatively.

* Our work is available at https://minhmanho.github.io/deep_preset

Via

Access Paper or Ask Questions

Bridging Unpaired Facial Photos And Sketches By Line-drawings

Feb 25, 2021

Meimei Shang, Fei Gao, Xiang Li, Jingjie Zhu, Lingna Dai

Figure 1 for Bridging Unpaired Facial Photos And Sketches By Line-drawings

Figure 2 for Bridging Unpaired Facial Photos And Sketches By Line-drawings

Figure 3 for Bridging Unpaired Facial Photos And Sketches By Line-drawings

Figure 4 for Bridging Unpaired Facial Photos And Sketches By Line-drawings

Abstract:In this paper, we propose a novel method to learn face sketch synthesis models by using unpaired data. Our main idea is bridging the photo domain $\mathcal{X}$ and the sketch domain $Y$ by using the line-drawing domain $\mathcal{Z}$. Specially, we map both photos and sketches to line-drawings by using a neural style transfer method, i.e. $F: \mathcal{X}/\mathcal{Y} \mapsto \mathcal{Z}$. Consequently, we obtain \textit{pseudo paired data} $(\mathcal{Z}, \mathcal{Y})$, and can learn the mapping $G:\mathcal{Z} \mapsto \mathcal{Y}$ in a supervised learning manner. In the inference stage, given a facial photo, we can first transfer it to a line-drawing and then to a sketch by $G \circ F$. Additionally, we propose a novel stroke loss for generating different types of strokes. Our method, termed sRender, accords well with human artists' rendering process. Experimental results demonstrate that sRender can generate multi-style sketches, and significantly outperforms existing unpaired image-to-image translation methods.

* accepted by ICASSP2021

Via

Access Paper or Ask Questions

UMFA: A photorealistic style transfer method based on U-Net and multi-layer feature aggregation

Aug 13, 2021

D. Y. Rao, X. J. Wu, H. Li, J. Kittler, T. Y. Xu

Figure 1 for UMFA: A photorealistic style transfer method based on U-Net and multi-layer feature aggregation

Figure 2 for UMFA: A photorealistic style transfer method based on U-Net and multi-layer feature aggregation

Figure 3 for UMFA: A photorealistic style transfer method based on U-Net and multi-layer feature aggregation

Figure 4 for UMFA: A photorealistic style transfer method based on U-Net and multi-layer feature aggregation

Abstract:In this paper, we propose a photorealistic style transfer network to emphasize the natural effect of photorealistic image stylization. In general, distortion of the image content and lacking of details are two typical issues in the style transfer field. To this end, we design a novel framework employing the U-Net structure to maintain the rich spatial clues, with a multi-layer feature aggregation (MFA) method to simultaneously provide the details obtained by the shallow layers in the stylization processing. In particular, an encoder based on the dense block and a decoder form a symmetrical structure of U-Net are jointly staked to realize an effective feature extraction and image reconstruction. Besides, a transfer module based on MFA and "adaptive instance normalization" (AdaIN) is inserted in the skip connection positions to achieve the stylization. Accordingly, the stylized image possesses the texture of a real photo and preserves rich content details without introducing any mask or post-processing steps. The experimental results on public datasets demonstrate that our method achieves a more faithful structural similarity with a lower style loss, reflecting the effectiveness and merit of our approach.

Via

Access Paper or Ask Questions

SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing

Nov 30, 2021

Jing Shi, Ning Xu, Haitian Zheng, Alex Smith, Jiebo Luo, Chenliang Xu

Figure 1 for SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing

Figure 2 for SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing

Figure 3 for SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing

Figure 4 for SpaceEdit: Learning a Unified Editing Space for Open-Domain Image Editing

Abstract:Recently, large pretrained models (e.g., BERT, StyleGAN, CLIP) have shown great knowledge transfer and generalization capability on various downstream tasks within their domains. Inspired by these efforts, in this paper we propose a unified model for open-domain image editing focusing on color and tone adjustment of open-domain images while keeping their original content and structure. Our model learns a unified editing space that is more semantic, intuitive, and easy to manipulate than the operation space (e.g., contrast, brightness, color curve) used in many existing photo editing softwares. Our model belongs to the image-to-image translation framework which consists of an image encoder and decoder, and is trained on pairs of before- and after-images to produce multimodal outputs. We show that by inverting image pairs into latent codes of the learned editing space, our model can be leveraged for various downstream editing tasks such as language-guided image editing, personalized editing, editing-style clustering, retrieval, etc. We extensively study the unique properties of the editing space in experiments and demonstrate superior performance on the aforementioned tasks.

Via

Access Paper or Ask Questions

Topic:photo style transfer

Papers and Code