Liqing Zhang

Deep Image Harmonization with Learnable Augmentation

Aug 01, 2023

Li Niu, Junyan Cao, Wenyan Cong, Liqing Zhang

Abstract: The goal of image harmonization is to adjust the foreground appearance in a composite image so that the whole image looks harmonious. To construct paired training images, existing datasets adopt different ways to adjust the illumination statistics of foregrounds of real images to produce synthetic composite images. However, different datasets have considerable domain gaps, and performance on small-scale datasets is limited by insufficient training data. In this work, we explore learnable augmentation to enrich the illumination diversity of small-scale datasets for better harmonization performance. In particular, our designed SYnthetic COmposite Network (SycoNet) takes in a real image with a foreground mask and a random vector to learn a suitable color transformation, which is applied to the foreground of this real image to produce a synthetic composite image. Comprehensive experiments demonstrate the effectiveness of our proposed learnable augmentation for image harmonization. The code of SycoNet is released at https://github.com/bcmi/SycoNet-Adaptive-Image-Harmonization.

* Accepted by ICCV 2023
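
The augmentation idea in this abstract (transform only the foreground colors of a real image to obtain a synthetic composite) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `make_synthetic_composite`, the per-channel gain and bias, and their ranges are hypothetical stand-ins for the transformation that SycoNet learns, and the random seed plays the role of the random vector.

```python
import random

def make_synthetic_composite(image, mask, seed):
    """Apply a random color transform to foreground pixels only.

    image: H x W nested list of (r, g, b) tuples with values in [0, 1]
    mask:  H x W nested list of 0/1 flags, 1 marks the foreground
    seed:  stands in for the random vector mentioned in the abstract
    """
    rng = random.Random(seed)
    # Hypothetical stand-in for the learned transformation:
    # one gain and one bias per color channel.
    gains = [rng.uniform(0.7, 1.3) for _ in range(3)]
    biases = [rng.uniform(-0.1, 0.1) for _ in range(3)]

    composite = []
    for row, mask_row in zip(image, mask):
        new_row = []
        for pixel, is_fg in zip(row, mask_row):
            if is_fg:
                # Transform and clamp each channel back into [0, 1].
                pixel = tuple(
                    min(1.0, max(0.0, g * c + b))
                    for c, g, b in zip(pixel, gains, biases)
                )
            new_row.append(pixel)  # background pixels pass through unchanged
        composite.append(new_row)
    return composite

image = [[(0.5, 0.5, 0.5), (0.2, 0.4, 0.6)],
         [(0.9, 0.1, 0.3), (0.0, 1.0, 0.5)]]
mask = [[1, 0],
        [0, 1]]
composite = make_synthetic_composite(image, mask, seed=0)
```

Different seeds give different foreground appearances for the same real image, which is how one image can yield several (composite, real) training pairs in this sketch.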

Deep Image Harmonization with Globally Guided Feature Transformation and Relation Distillation

Aug 01, 2023

Li Niu, Linfeng Tan, Xinhao Tao, Junyan Cao, Fengjun Guo, Teng Long, Liqing Zhang

Abstract: Given a composite image, image harmonization aims to adjust the foreground illumination to be consistent with the background. Previous methods have explored transforming foreground features to achieve competitive performance. In this work, we show that using global information to guide foreground feature transformation can achieve significant improvement. In addition, we propose to transfer the foreground-background relation from real images to composite images, which can provide intermediate supervision for the transformed encoder features. Considering the drawbacks of existing harmonization datasets, we also contribute the ccHarmony dataset, which simulates natural illumination variation. Extensive experiments on iHarmony4 and our contributed dataset demonstrate the superiority of our method. Our ccHarmony dataset is released at https://github.com/bcmi/Image-Harmonization-Dataset-ccHarmony.

* Accepted by ICCV 2023

WeditGAN: Few-shot Image Generation via Latent Space Relocation

May 11, 2023

Yuxuan Duan, Li Niu, Yan Hong, Liqing Zhang

Abstract: In few-shot image generation, directly training GAN models on just a handful of images faces the risk of overfitting. A popular solution is to transfer models pretrained on large source domains to small target ones. In this work, we introduce WeditGAN, which realizes model transfer by editing the intermediate latent codes $w$ in StyleGANs with learned constant offsets ($\Delta w$), discovering and constructing target latent spaces by simply relocating the distribution of source latent spaces. The established one-to-one mapping between latent spaces naturally prevents mode collapse and overfitting. We also propose variants of WeditGAN that further enhance the relocation process by regularizing the direction or finetuning the intensity of $\Delta w$. Experiments on a collection of widely used source/target datasets demonstrate the capability of WeditGAN in generating realistic and diverse images, making it a simple yet highly effective approach to few-shot image generation.
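
The relocation described above amounts to shifting every source latent code by the same offset, $w_{target} = w + \Delta w$. The following is a minimal sketch of that arithmetic only, with latent codes as plain lists of floats. The function name `relocate` and the scalar `alpha` are hypothetical; `alpha` loosely mirrors the variant that adjusts the intensity of $\Delta w$ and is not taken from the paper's code.

```python
def relocate(w_codes, delta_w, alpha=1.0):
    """Shift each latent code w by the same constant offset alpha * delta_w.

    w_codes: list of latent codes, each a list of floats
    delta_w: one offset vector shared by all codes (learned in the paper,
             fixed by hand in this sketch)
    alpha:   hypothetical intensity scalar, 1.0 means the plain offset
    """
    return [
        [w_i + alpha * d_i for w_i, d_i in zip(w, delta_w)]
        for w in w_codes
    ]

source_codes = [[0.0, 1.0], [2.0, 3.0]]
delta_w = [0.5, -0.5]
target_codes = relocate(source_codes, delta_w)
```

Because the same offset is added to every code, two distinct source codes always stay distinct after relocation, which is the one-to-one property the abstract relies on.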

Few-Shot Defect Image Generation via Defect-Aware Feature Manipulation

Mar 04, 2023

Yuxuan Duan, Yan Hong, Li Niu, Liqing Zhang

Abstract: The performance of defect inspection has been severely hindered by insufficient defect images in industry, which can be alleviated by generating more samples as data augmentation. We propose the first defect image generation method for the challenging few-shot case. Given just a handful of defect images and relatively more defect-free ones, our goal is to augment the dataset with new defect images. Our method consists of two training stages. First, we train a data-efficient StyleGAN2 on defect-free images as the backbone. Second, we attach defect-aware residual blocks to the backbone, which learn to produce reasonable defect masks and accordingly manipulate the features within the masked regions by training the added modules on limited defect images. Extensive experiments on the MVTec AD dataset not only validate the effectiveness of our method in generating realistic and diverse defect images, but also demonstrate the benefits it brings to downstream defect inspection tasks. Code is available at https://github.com/Ldhlwh/DFMGAN.

* Accepted by AAAI 2023

Weak-shot Semantic Segmentation via Dual Similarity Transfer

Oct 05, 2022

Junjie Chen, Li Niu, Siyuan Zhou, Jianlou Si, Chen Qian, Liqing Zhang

Abstract: Semantic segmentation is an important and prevalent task, but it suffers severely from the high cost of pixel-level annotations when extending to more classes in wider applications. To this end, we focus on the problem named weak-shot semantic segmentation, where novel classes are learnt from cheaper image-level labels with the support of base classes having off-the-shelf pixel-level labels. To tackle this problem, we propose SimFormer, which performs dual similarity transfer upon MaskFormer. Specifically, MaskFormer disentangles the semantic segmentation task into two sub-tasks: proposal classification and proposal segmentation for each proposal. Proposal segmentation allows proposal-pixel similarity transfer from base classes to novel classes, which enables the mask learning of novel classes. We also learn pixel-pixel similarity from base classes and distill such class-agnostic semantic similarity to the semantic masks of novel classes, which regularizes the segmentation model with pixel-level semantic relationships across images. In addition, we propose a complementary loss to facilitate the learning of novel classes. Comprehensive experiments on the challenging COCO-Stuff-10K and ADE20K datasets demonstrate the effectiveness of our method. Code is available at https://github.com/bcmi/SimFormer-Weak-Shot-Semantic-Segmentation.

* Accepted by NeurIPS 2022

Inharmonious Region Localization via Recurrent Self-Reasoning

Oct 05, 2022

Penghao Wu, Li Niu, Jing Liang, Liqing Zhang

Abstract: Synthetic images created by image editing operations are prevalent, but color or illumination inconsistency between the manipulated region and the background may make them unrealistic. Thus, it is important yet challenging to localize the inharmonious region in order to improve the quality of synthetic images. Inspired by the classic clustering algorithm, we aim to group pixels into two clusters, an inharmonious cluster and a background cluster, by inserting a novel Recurrent Self-Reasoning (RSR) module into the bottleneck of a UNet structure. The mask output from the RSR module is provided to the decoder as attention guidance. Finally, we adaptively combine the masks from RSR and the decoder to form our final mask. Experimental results on the image harmonization dataset demonstrate that our method achieves competitive performance both quantitatively and qualitatively.

* BMVC 2022
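
The abstract above only says the approach is inspired by a classic clustering algorithm that splits pixels into two groups. As a point of reference, here is a minimal sketch of plain two-cluster k-means on scalar pixel features. It is not the RSR module; the function name `two_means` and the use of a single scalar feature per pixel are assumptions made for illustration.

```python
def two_means(values, iters=10):
    """Split scalar features into two clusters with plain k-means (k = 2).

    Returns (labels, (center0, center1)), where labels[i] is 0 or 1.
    """
    # Initialize the two centers at the extremes of the data.
    c0, c1 = min(values), max(values)
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value goes to its nearest center.
        labels = [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
        # Update step: each center moves to the mean of its members.
        group0 = [v for v, l in zip(values, labels) if l == 0]
        group1 = [v for v, l in zip(values, labels) if l == 1]
        if group0:
            c0 = sum(group0) / len(group0)
        if group1:
            c1 = sum(group1) / len(group1)
    return labels, (c0, c1)

# Hypothetical per-pixel features: three background-like, two outliers.
features = [0.10, 0.12, 0.11, 0.90, 0.88]
labels, centers = two_means(features)
```

In this toy setting the resulting label list acts as a binary mask over the pixels, which is the kind of output an inharmonious-region localizer produces.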

Inharmonious Region Localization with Auxiliary Style Feature

Oct 05, 2022

Penghao Wu, Li Niu, Liqing Zhang

Abstract: With the prevalence of image editing techniques, users can create fantastic synthetic images, but the image quality may be compromised by the color/illumination discrepancy between the manipulated region and the background. Inharmonious region localization aims to localize the inharmonious region in a synthetic image. In this work, we attempt to leverage auxiliary style features to facilitate this task. Specifically, we propose a novel color mapping module and a style feature loss to extract discriminative style features containing task-relevant color/illumination information. Based on the extracted style features, we also propose a novel style voting module to guide the localization of the inharmonious region. Moreover, we introduce semantic information into the style voting module to achieve further improvement. Our method surpasses the existing methods by a large margin on the benchmark dataset.

* BMVC 2022

DeltaGAN: Towards Diverse Few-shot Image Generation with Sample-Specific Delta

Jul 28, 2022

Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang

Abstract: Learning to generate new images for a novel category based on only a few images, known as few-shot image generation, has attracted increasing research interest. Several state-of-the-art works have yielded impressive results, but the diversity is still limited. In this work, we propose a novel Delta Generative Adversarial Network (DeltaGAN), which consists of a reconstruction subnetwork and a generation subnetwork. The reconstruction subnetwork captures the intra-category transformation, i.e., the delta, between same-category pairs. The generation subnetwork generates a sample-specific delta for an input image, which is combined with this input image to generate a new image within the same category. In addition, an adversarial delta matching loss is designed to link the two subnetworks together. Extensive experiments on six benchmark datasets demonstrate the effectiveness of our proposed method. Our code is available at https://github.com/bcmi/DeltaGAN-Few-Shot-Image-Generation.

* I want to withdraw this version and use it to update the previous version at arXiv:2009.08753

Learning Object Placement via Dual-path Graph Completion

Jul 23, 2022

Siyuan Zhou, Liu Liu, Li Niu, Liqing Zhang

Abstract: Object placement aims to place a foreground object over a background image with a suitable location and size. In this work, we treat object placement as a graph completion problem and propose a novel graph completion module (GCM). The background scene is represented by a graph with multiple nodes at different spatial locations with various receptive fields. The foreground object is encoded as a special node that should be inserted at a reasonable place in this graph. We also design a dual-path framework upon the structure of GCM to fully exploit annotated composite images. In extensive experiments on the OPA dataset, our method significantly outperforms existing methods in generating plausible object placements without loss of diversity.

* 25 pages, 9 figures

Few-shot Image Generation Using Discrete Content Representation

Jul 22, 2022

Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang

Abstract: Few-shot image generation and few-shot image translation are two related tasks, both of which aim to generate new images for an unseen category with only a few images. In this work, we make the first attempt to adapt a few-shot image translation method to the few-shot image generation task. Few-shot image translation disentangles an image into a style vector and a content map. An unseen style vector can be combined with different seen content maps to produce different images. However, it needs to store seen images to provide content maps, and the unseen style vector may be incompatible with seen content maps. To adapt it to the few-shot image generation task, we learn a compact dictionary of local content vectors by quantizing continuous content maps into discrete content maps instead of storing seen images. Furthermore, we model the autoregressive distribution of the discrete content map conditioned on the style vector, which can alleviate the incompatibility between content map and style vector. Qualitative and quantitative results on three real datasets demonstrate that our model can produce images of higher diversity and fidelity for unseen categories than previous methods.

* This paper is accepted by ACM MM 2022
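
The quantization step mentioned in the abstract above (turning a continuous content map into a discrete one via a dictionary of local content vectors) can be sketched as a nearest-neighbor lookup. This is a minimal illustration under stated assumptions: the function name `quantize`, the squared Euclidean distance, and the hand-written codebook are hypothetical, and in the paper the dictionary is learned rather than fixed.

```python
def quantize(content_map, codebook):
    """Replace each local content vector by the index of its nearest code.

    content_map: H x W nested list of feature vectors (lists of floats)
    codebook:    list of code vectors, the "dictionary" (fixed here by hand)
    Returns an H x W nested list of integer indices into the codebook.
    """
    def nearest(vector):
        # Squared Euclidean distance to every code; keep the closest index.
        return min(
            range(len(codebook)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(vector, codebook[k])),
        )

    return [[nearest(vector) for vector in row] for row in content_map]

codebook = [[0.0, 0.0], [1.0, 1.0]]
content_map = [[[0.1, 0.2], [0.9, 0.8]]]
discrete_map = quantize(content_map, codebook)
```

Only the integer indices and the small codebook need to be kept, which is why the abstract contrasts this with storing seen images to supply content maps.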
