Xingyu Wan

Auxiliary Loss Adaptation for Image Inpainting

Nov 22, 2021
Siqi Hui, Sanping Zhou, Xingyu Wan, Jixin Wang, Ye Deng, Yang Wu, Zhenghao Gong, Jinjun Wang

Figures 1–4 for Auxiliary Loss Adaptation for Image Inpainting

Auxiliary losses commonly used in image inpainting lead to better reconstruction performance by incorporating prior knowledge of missing regions. However, it usually requires considerable effort to fully exploit their potential: improperly weighted auxiliary losses distract the model from the inpainting task, and the effectiveness of an auxiliary loss may vary over the course of training. Designing auxiliary losses therefore takes strong domain expertise. To mitigate this problem, we introduce the Auxiliary Loss Adaptation for Image Inpainting (ALA) algorithm, which dynamically adjusts the parameters of the auxiliary loss. Our method is based on the principle that the best auxiliary loss is the one that most increases the performance of the main loss after several steps of gradient descent. We examine two auxiliary losses commonly used in inpainting and use ALA to adapt their parameters. Experimental results show that ALA induces more competitive inpainting results than fixed auxiliary losses. In particular, by simply combining an auxiliary loss with ALA, existing inpainting methods achieve improved performance without delicate network design or explicit structural priors.
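The adaptation principle described in the abstract — keep an auxiliary loss only as strongly weighted as it helps the main loss after a few steps of gradient descent — can be sketched as a lookahead test. The function below is an illustrative toy, not the paper's implementation; the scalar parameter, learning rates, and the multiplicative weight update are all assumptions.

```python
def ala_update_weight(param, main_loss, main_grad, aux_grad,
                      aux_weight, lr=0.1, meta_lr=0.5, steps=3):
    """Lookahead test: run a few SGD steps with and without the weighted
    auxiliary gradient, then compare the resulting main loss."""
    p_comb = p_main = param
    for _ in range(steps):
        p_comb -= lr * (main_grad(p_comb) + aux_weight * aux_grad(p_comb))
        p_main -= lr * main_grad(p_main)
    if main_loss(p_comb) < main_loss(p_main):
        return aux_weight * (1.0 + meta_lr)  # auxiliary loss helped: upweight it
    return aux_weight * (1.0 - meta_lr)      # it distracted the model: downweight it

# Toy quadratic objective: an auxiliary gradient pointing the same way as the
# main gradient helps, one pointing the opposite way hurts.
helpful = ala_update_weight(2.0, lambda p: p * p, lambda p: 2 * p,
                            lambda p: 2 * p, aux_weight=0.5)
harmful = ala_update_weight(2.0, lambda p: p * p, lambda p: 2 * p,
                            lambda p: -2 * p, aux_weight=0.5)
```

In this toy, the helpful auxiliary gradient's weight grows while the harmful one's shrinks, mirroring the claim that a fixed weight can be the wrong choice at different points in training.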


Auxiliary Loss Adaption for Image Inpainting

Nov 14, 2021
Siqi Hui, Sanping Zhou, Xingyu Wan, Jixin Wang, Ye Deng, Yang Wu, Zhenghao Gong, Jinjun Wang

Figures 1–4 for Auxiliary Loss Adaption for Image Inpainting

Auxiliary losses commonly used in image inpainting lead to better reconstruction performance by incorporating prior knowledge of missing regions. However, it usually takes considerable effort to fully exploit their potential, since improperly weighted auxiliary losses distract the model from the inpainting task, and the effectiveness of an auxiliary loss may vary over the course of training. Furthermore, designing auxiliary losses takes domain expertise. In this work, we introduce the Auxiliary Loss Adaption (ALA) algorithm, which dynamically adjusts the parameters of the auxiliary loss to better assist the primary task. Our algorithm is based on the principle that a better auxiliary loss is one that helps increase the performance of the main loss through several steps of gradient descent. We examine two auxiliary losses commonly used in inpainting and use ALA to adapt their parameters. Experimental results show that ALA induces more competitive inpainting results than fixed auxiliary losses. In particular, by simply combining an auxiliary loss with ALA, existing inpainting methods achieve improved performance without delicate network design or explicit structural priors.


Teacher-Student Asynchronous Learning with Multi-Source Consistency for Facial Landmark Detection

Dec 12, 2020
Rongye Meng, Sanping Zhou, Xingyu Wan, Mengliu Li, Jinjun Wang

Figures 1–4 for Teacher-Student Asynchronous Learning with Multi-Source Consistency for Facial Landmark Detection

Due to the high annotation cost of large-scale facial landmark detection in videos, researchers have proposed semi-supervised paradigms that use self-training to mine high-quality pseudo-labels for training. However, self-training-based methods often train with a gradually increasing number of samples, and their performance varies considerably with the number of pseudo-labeled samples added. In this paper, we propose a teacher-student asynchronous learning (TSAL) framework based on a multi-source supervision-signal consistency criterion, which mines pseudo-labels implicitly through consistency constraints. Specifically, the TSAL framework contains two models with exactly the same structure. The radical student updates its parameters using multi-source supervision signals from the same task, while the calm teacher updates its parameters using a single-source supervision signal. To absorb the student's suggestions in a measured way, the teacher's parameters are updated again through recursive average filtering. The experimental results show that the asynchronous learning framework effectively filters noise in multi-source supervision signals, thereby mining pseudo-labels that are more significant for network parameter updates. Extensive experiments on the 300W, AFLW, and 300VW benchmarks show that the TSAL framework achieves state-of-the-art performance.
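The "recursive average filtering" of the teacher's parameters can be read as an exponential moving average of the student's parameters: the teacher absorbs the student's suggestions gradually rather than copying them outright. A minimal sketch, assuming an EMA form and a flat list of parameters (the momentum value is an illustrative assumption, not taken from the paper):

```python
def update_teacher(teacher_params, student_params, momentum=0.99):
    """Recursive average filtering of the teacher's parameters: each update
    blends a small fraction of the student's current parameters into the
    teacher's running average."""
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher_params, student_params)]

# The teacher drifts slowly toward the student instead of jumping to it.
teacher = [0.0, 1.0]
student = [1.0, 0.0]
teacher = update_teacher(teacher, student, momentum=0.9)  # ≈ [0.1, 0.9]
```

A slow-moving teacher of this kind filters out noisy updates in the student's multi-source signals, which matches the abstract's description of the calm teacher.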

* second version 

End-to-End Multi-Object Tracking with Global Response Map

Jul 13, 2020
Xingyu Wan, Jiakai Cao, Sanping Zhou, Jinjun Wang

Figures 1–4 for End-to-End Multi-Object Tracking with Global Response Map

Most existing Multi-Object Tracking (MOT) approaches follow the tracking-by-detection paradigm and its data-association framework, in which objects are first detected and then associated across frames. Although deep-learning-based methods noticeably improve object detection and provide good appearance features for cross-frame association, the framework is not completely end-to-end, so the computation is heavy while the performance remains limited. To address this problem, we present a completely end-to-end approach that takes an image sequence or video as input and directly outputs the located and tracked objects of learned types. Specifically, with our multi-object representation strategy, a global response map can be accurately generated over frames, from which the trajectory of each tracked object can easily be picked up, just as a detector takes an image and outputs the bounding boxes of each detected object. The proposed model is fast and accurate. Experimental results on the MOT16 and MOT17 benchmarks show that our online tracker achieves state-of-the-art performance on several tracking metrics.
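Picking a trajectory off a global response map amounts to scanning each frame's heatmap for a peak, much as a heatmap-based detector reads off object centers. The sketch below is a simplified single-object illustration; the threshold, the nested-list map format, and the per-frame argmax are assumptions, and the paper's model handles multiple objects of learned types.

```python
def pick_trajectory(response_maps, threshold=0.5):
    """Pick the peak location of each frame's 2D response map as the object's
    center; frames with no response above the threshold yield None."""
    trajectory = []
    for frame in response_maps:
        best, best_pos = threshold, None
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                if value > best:
                    best, best_pos = value, (x, y)
        trajectory.append(best_pos)
    return trajectory

# Two 2x2 response maps: the peak moves from (1, 0) to (0, 1) across frames.
maps = [[[0.1, 0.9],
         [0.2, 0.1]],
        [[0.1, 0.1],
         [0.8, 0.1]]]
track = pick_trajectory(maps)  # [(1, 0), (0, 1)]
```

The per-frame peak sequence is the trajectory, which is what makes the pipeline end-to-end: no separate detection and association stages are needed.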
