Alert button
Picture for Xiaoshuang Chen

Xiaoshuang Chen

Alert button

Combining Past, Present and Future: A Self-Supervised Approach for Class Incremental Learning

Nov 15, 2023
Xiaoshuang Chen, Zhongyi Sun, Ke Yan, Shouhong Ding, Hongtao Lu

Class Incremental Learning (CIL) aims to handle the scenario where data of novel classes occur continuously and sequentially. The model should recognize the sequential novel classes while alleviating the catastrophic forgetting. In the self-supervised manner, it becomes more challenging to avoid the conflict between the feature embedding spaces of novel classes and old ones without any class labels. To address the problem, we propose a self-supervised CIL framework CPPF, meaning Combining Past, Present and Future. In detail, CPPF consists of a prototype clustering module (PC), an embedding space reserving module (ESR) and a multi-teacher distillation module (MTD). 1) The PC and the ESR modules reserve embedding space for subsequent phases at the prototype level and the feature level respectively to prepare for knowledge learned in the future. 2) The MTD module maintains the representations of the current phase without the interference of past knowledge. One of the teacher networks retains the representations of the past phases, and the other teacher network distills relation information of the current phase to the student network. Extensive experiments on CIFAR100 and ImageNet100 datasets demonstrate that our proposed method boosts the performance of self-supervised class incremental learning. We will release code in the near future.

Viaarxiv icon

Reinforcing Local Feature Representation for Weakly-Supervised Dense Crowd Counting

Feb 22, 2022
Xiaoshuang Chen, Hongtao Lu

Figure 1 for Reinforcing Local Feature Representation for Weakly-Supervised Dense Crowd Counting
Figure 2 for Reinforcing Local Feature Representation for Weakly-Supervised Dense Crowd Counting
Figure 3 for Reinforcing Local Feature Representation for Weakly-Supervised Dense Crowd Counting
Figure 4 for Reinforcing Local Feature Representation for Weakly-Supervised Dense Crowd Counting

Fully-supervised crowd counting is a laborious task due to the large amounts of annotations. Few works focus on weekly-supervised crowd counting, where only the global crowd numbers are available for training. The main challenge of weekly-supervised crowd counting is the lack of local supervision information. To address this problem, we propose a self-adaptive feature similarity learning (SFSL) network and a global-local consistency (GLC) loss to reinforce local feature representation. We introduce a feature vector which represents the unbiased feature estimation of persons. The network updates the feature vector self-adaptively and utilizes the feature similarity for the regression of crowd numbers. Besides, the proposed GLC loss leverages the consistency between the network estimations from global and local areas. The experimental results demonstrate that our proposed method based on different backbones narrows the gap between weakly-supervised and fully-supervised dense crowd counting.

Viaarxiv icon

PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting

Oct 31, 2021
Xiaoshuang Chen, Yiru Zhao, Yu Qin, Fei Jiang, Mingyuan Tao, Xiansheng Hua, Hongtao Lu

Figure 1 for PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting
Figure 2 for PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting
Figure 3 for PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting
Figure 4 for PANet: Perspective-Aware Network with Dynamic Receptive Fields and Self-Distilling Supervision for Crowd Counting

Crowd counting aims to learn the crowd density distributions and estimate the number of objects (e.g. persons) in images. The perspective effect, which significantly influences the distribution of data points, plays an important role in crowd counting. In this paper, we propose a novel perspective-aware approach called PANet to address the perspective problem. Based on the observation that the size of the objects varies greatly in one image due to the perspective effect, we propose the dynamic receptive fields (DRF) framework. The framework is able to adjust the receptive field by the dilated convolution parameters according to the input image, which helps the model to extract more discriminative features for each local region. Different from most previous works which use Gaussian kernels to generate the density map as the supervised information, we propose the self-distilling supervision (SDS) training method. The ground-truth density maps are refined from the first training stage and the perspective information is distilled to the model in the second stage. The experimental results on ShanghaiTech Part_A and Part_B, UCF_QNRF, and UCF_CC_50 datasets demonstrate that our proposed PANet outperforms the state-of-the-art methods by a large margin.

Viaarxiv icon

Semi-supervised Learning with Contrastive Predicative Coding

May 25, 2019
Jiaxing Wang, Yin Zheng, Xiaoshuang Chen, Junzhou Huang, Jian Cheng

Figure 1 for Semi-supervised Learning with Contrastive Predicative Coding
Figure 2 for Semi-supervised Learning with Contrastive Predicative Coding
Figure 3 for Semi-supervised Learning with Contrastive Predicative Coding
Figure 4 for Semi-supervised Learning with Contrastive Predicative Coding

Semi-supervised learning (SSL) provides a powerful framework for leveraging unlabeled data when labels are limited or expensive to obtain. SSL algorithms based on deep neural networks have recently proven successful on standard benchmark tasks. However, many of them have thus far been either inflexible, inefficient or non-scalable. This paper explores recently developed contrastive predictive coding technique to improve discriminative power of deep learning models when a large portion of labels are absent. Two models, cpc-SSL and a class conditional variant~(ccpc-SSL) are presented. They effectively exploit the unlabeled data by extracting shared information between different parts of the (high-dimensional) data. The proposed approaches are inductive, and scale well to very large datasets like ImageNet, making them good candidates in real-world large scale applications.

* 6 pages, 4 figures, conference 
Viaarxiv icon

RaFM: Rank-Aware Factorization Machines

May 18, 2019
Xiaoshuang Chen, Yin Zheng, Jiaxing Wang, Wenye Ma, Junzhou Huang

Figure 1 for RaFM: Rank-Aware Factorization Machines
Figure 2 for RaFM: Rank-Aware Factorization Machines
Figure 3 for RaFM: Rank-Aware Factorization Machines
Figure 4 for RaFM: Rank-Aware Factorization Machines

Factorization machines (FM) are a popular model class to learn pairwise interactions by a low-rank approximation. Different from existing FM-based approaches which use a fixed rank for all features, this paper proposes a Rank-Aware FM (RaFM) model which adopts pairwise interactions from embeddings with different ranks. The proposed model achieves a better performance on real-world datasets where different features have significantly varying frequencies of occurrences. Moreover, we prove that the RaFM model can be stored, evaluated, and trained as efficiently as one single FM, and under some reasonable conditions it can be even significantly more efficient than FM. RaFM improves the performance of FMs in both regression tasks and classification tasks while incurring less computational burden, therefore also has attractive potential in industrial applications.

* 9 pages, 4 figures, accepted by ICML 2019 
Viaarxiv icon