Pengfei Chen

Spatial Self-Distillation for Object Detection with Inaccurate Bounding Boxes

Aug 15, 2023
Di Wu, Pengfei Chen, Xuehui Yu, Guorong Li, Zhenjun Han, Jianbin Jiao

Object detection with inaccurate bounding-box supervision has attracted broad interest, since high-quality annotations are expensive and low annotation quality is sometimes unavoidable (e.g., for tiny objects). Previous works usually rely on multiple instance learning (MIL), which depends heavily on category information, to select and refine a low-quality box. Without exploiting spatial information, those methods suffer from object drift, group prediction, and part domination. In this paper, we propose a Spatial Self-Distillation based Object Detector (SSD-Det) that mines spatial information to refine inaccurate boxes in a self-distillation fashion. SSD-Det uses a Spatial Position Self-Distillation (SPSD) module to exploit spatial information and an interactive structure to combine spatial and category information, thus constructing a high-quality proposal bag. To further improve the selection procedure, a Spatial Identity Self-Distillation (SISD) module is introduced to obtain spatial confidence that helps select the best proposals. Experiments on the MS-COCO and VOC datasets with noisy box annotations verify our method's effectiveness; it achieves state-of-the-art performance. The code is available at https://github.com/ucas-vg/PointTinyBenchmark/tree/SSD-Det.
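The abstract does not spell out how category and spatial scores are combined when selecting from the proposal bag; the following is a minimal sketch of that selection step under the assumption of a simple weighted blend. `select_best_proposal`, `spatial_conf`, and `alpha` are illustrative names, not the paper's API.

```python
import numpy as np

def iou(a, b):
    """IoU between two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda box: (box[2] - box[0]) * (box[3] - box[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def select_best_proposal(proposals, cls_scores, spatial_conf, alpha=0.5):
    """Pick the proposal maximizing a blend of MIL-style category
    score and SISD-style spatial confidence (illustrative blend)."""
    combined = alpha * np.asarray(cls_scores) + (1 - alpha) * np.asarray(spatial_conf)
    return proposals[int(np.argmax(combined))]
```

With `alpha` low, a proposal with modest category score but high spatial confidence can win the bag, which is the behavior the SISD module is meant to enable.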

* accepted by ICCV 2023 

HyperTuner: A Cross-Layer Multi-Objective Hyperparameter Auto-Tuning Framework for Data Analytic Services

Apr 20, 2023
Hui Dou, Shanshan Zhu, Yiwen Zhang, Pengfei Chen, Zibin Zheng

Hyperparameter optimization (HPO) is vital for machine learning models. Besides model accuracy, other tuning objectives such as training time and energy consumption also deserve attention from data-analytic service providers. Hence, it is essential to take both model hyperparameters and system parameters into consideration and perform cross-layer multi-objective hyperparameter auto-tuning. Toward this challenging target, we propose HyperTuner. To address the resulting high-dimensional black-box multi-objective optimization problem, HyperTuner first conducts multi-objective parameter importance ranking with its MOPIR algorithm and then leverages the proposed ADUMBO algorithm to find the Pareto-optimal configuration set. In each iteration, ADUMBO selects the most promising configuration from the generated Pareto candidate set by maximizing a new well-designed metric, which adaptively balances the uncertainty and the predicted mean across all surrogate models as iterations proceed. We evaluate HyperTuner on our local distributed TensorFlow cluster; experimental results show that it always finds a Pareto configuration front superior in both convergence and diversity to those of four baseline algorithms. Moreover, experiments with different training datasets, optimization objectives, and machine learning platforms verify that HyperTuner adapts well to various data-analytic service scenarios.
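The abstract does not give ADUMBO's exact metric, only that it adaptively weighs surrogate uncertainty against predicted mean over iterations. The sketch below uses a generic UCB-style blend with a decaying exploration weight as a stand-in; `beta0`, the decay schedule, and the candidate encoding are all assumptions for illustration.

```python
import math

def adaptive_acquisition(candidates, iteration, beta0=2.0):
    """Score each Pareto candidate by combining per-objective surrogate
    means and uncertainties; the uncertainty weight decays with the
    iteration count so exploration fades into exploitation.
    Each candidate is (means, stds) for a minimization problem."""
    beta = beta0 / math.sqrt(iteration + 1)  # illustrative adaptive trade-off
    def score(c):
        means, stds = c
        # lower predicted mean is better; higher uncertainty is worth exploring
        return sum(beta * s - m for m, s in zip(means, stds))
    return max(candidates, key=score)
```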


Point-to-Box Network for Accurate Object Detection via Single Point Supervision

Jul 14, 2022
Pengfei Chen, Xuehui Yu, Xumeng Han, Najmul Hassan, Kai Wang, Jiachen Li, Jian Zhao, Humphrey Shi, Zhenjun Han, Qixiang Ye

Object detection using single point supervision has received increasing attention over the years. In this paper, we attribute the large performance gap between point-supervised and bounding-box-supervised detectors to the failure to generate high-quality proposal bags, which are crucial for multiple instance learning (MIL). To address this problem, we introduce a lightweight alternative to off-the-shelf proposal (OTSP) methods and thereby create the Point-to-Box Network (P2BNet), which constructs an inter-object balanced proposal bag by generating proposals in an anchor-like way. By fully exploiting the accurate position information, P2BNet further constructs instance-level bags, avoiding the mixture of multiple objects. Finally, a coarse-to-fine policy in a cascade fashion is used to improve the IoU between proposals and ground truth (GT). Benefiting from these strategies, P2BNet produces high-quality instance-level bags for object detection. P2BNet improves the mean average precision (AP) by more than 50% relative to the previous best point-supervised object detection (PSOD) method on the MS-COCO dataset, demonstrating great potential to bridge the performance gap between point-supervised and bounding-box-supervised detectors. The code will be released at github.com/ucas-vg/P2BNet.
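The "anchor-like" proposal generation around an annotated point can be pictured as below: one box per (scale, aspect-ratio) pair, centered on the point, forming an instance-level bag. The specific scales and ratios are illustrative defaults, not values from the paper.

```python
def point_to_proposals(px, py, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """Generate an anchor-like proposal bag centered on an annotated
    point: one (x1, y1, x2, y2) box per (scale, aspect-ratio) pair."""
    bag = []
    for s in scales:
        for r in ratios:
            w = s * (r ** 0.5)   # width grows with the aspect ratio
            h = s / (r ** 0.5)   # height shrinks correspondingly
            bag.append((px - w / 2, py - h / 2, px + w / 2, py + h / 2))
    return bag
```

Because every object gets the same number of proposals regardless of its appearance, the resulting bag is balanced across objects, unlike OTSP methods whose proposal counts vary per object.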

* Accepted by ECCV2022 

Robust Medical Image Classification from Noisy Labeled Data with Global and Local Representation Guided Co-training

May 10, 2022
Cheng Xue, Lequan Yu, Pengfei Chen, Qi Dou, Pheng-Ann Heng

Deep neural networks have achieved remarkable success in a wide variety of natural-image and medical-image computing tasks. However, these achievements rely indispensably on accurately annotated training data. When some images are noisily labeled, network training suffers, leading to a sub-optimal classifier. This problem is even more severe in medical image analysis, where annotation quality depends heavily on the expertise and experience of annotators. In this paper, we propose a novel collaborative training paradigm with global and local representation learning for robust medical image classification from noisy-labeled data, to combat the lack of high-quality annotated medical data. Specifically, we employ a self-ensemble model with a noisy-label filter to efficiently select clean and noisy samples. The clean samples are then trained with a collaborative training strategy to eliminate disturbance from imperfectly labeled samples. Notably, we further design a novel global and local representation learning scheme that implicitly regularizes the networks to utilize noisy samples in a self-supervised manner. We evaluated our robust learning strategy on four public medical image classification datasets with three types of label noise, i.e., random noise, computer-generated label noise, and inter-observer variability noise. Our method outperforms other methods for learning from noisy labels, and we also conducted extensive experiments to analyze each component of our method.
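The abstract does not state the exact filtering criterion of the noisy-label filter; a common instantiation in this line of work is small-loss selection, sketched below under that assumption. `split_clean_noisy` and `noise_rate` are illustrative names.

```python
import numpy as np

def split_clean_noisy(losses, noise_rate):
    """Small-loss-style noisy-label filter: treat the (1 - noise_rate)
    fraction of samples with the smallest per-sample loss as clean and
    the rest as noisy. Returns (clean_idx, noisy_idx)."""
    losses = np.asarray(losses)
    n_clean = int(round(len(losses) * (1 - noise_rate)))
    order = np.argsort(losses)  # ascending: small loss first
    return order[:n_clean], order[n_clean:]
```

The clean split would then feed the collaborative training branch, while the noisy split is used only through the self-supervised representation objective.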


Acknowledging the Unknown for Multi-label Learning with Single Positive Labels

Mar 30, 2022
Donghao Zhou, Pengfei Chen, Qiong Wang, Guangyong Chen, Pheng-Ann Heng

Due to the difficulty of collecting exhaustive multi-label annotations, multi-label training data often contains partial labels. We consider an extreme of this problem, called single positive multi-label learning (SPML), where each multi-label training image has only one positive label. Traditionally, all unannotated labels in SPML are assumed to be negative, which introduces false negatives and causes model training to be dominated by the assumed negative labels. In this work, we treat all unannotated labels from a different perspective, i.e., by acknowledging that they are unknown. Hence, we propose an entropy-maximization (EM) loss that maximizes the entropy of the predicted probabilities for all unannotated labels. Considering the positive-negative imbalance of the unannotated labels, we propose asymmetric pseudo-labeling (APL) with asymmetric-tolerance strategies and a self-paced procedure to provide more precise supervision. Experiments show that our method significantly improves performance and achieves state-of-the-art results on all four benchmarks.
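The EM loss idea admits a compact sketch: maximizing entropy over unannotated labels is equivalent to minimizing negative entropy of their sigmoid outputs. The function name and the mask convention below are illustrative, not the paper's interface.

```python
import numpy as np

def em_loss(probs, annotated_mask):
    """Entropy-maximization (EM) loss sketch: for every unannotated
    label, minimize the negative binary entropy of its predicted
    probability (i.e., maximize entropy). `probs` are per-label
    sigmoid outputs; `annotated_mask` is True where a label is annotated."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-7, 1 - 1e-7)
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))
    unannotated = ~np.asarray(annotated_mask)
    return -entropy[unannotated].mean()  # lowest when predictions are maximally uncertain
```

Confident predictions (near 0 or 1) on unknown labels are penalized, which is exactly the "acknowledge the unknown" behavior, in contrast to the assumed-negative convention that rewards confident negatives.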

* 26 pages, 11 figures 

Object Localization under Single Coarse Point Supervision

Mar 17, 2022
Xuehui Yu, Pengfei Chen, Di Wu, Najmul Hassan, Guorong Li, Junchi Yan, Humphrey Shi, Qixiang Ye, Zhenjun Han

Point-based object localization (POL), which pursues high-performance object sensing at low annotation cost, has attracted increasing attention. However, the point annotation mode inevitably introduces semantic variance due to the inconsistency of annotated points. Existing POL methods rely heavily on accurate key-point annotations, which are difficult to define. In this study, we propose a POL method using coarse point annotations, relaxing the supervision signal from accurate key points to freely spotted points. To this end, we propose a coarse point refinement (CPR) approach, which to our best knowledge is the first attempt to alleviate semantic variance from an algorithmic perspective. CPR constructs point bags, selects semantically correlated points, and produces semantic center points through multiple instance learning (MIL). In this way, CPR defines a weakly supervised evolution procedure that enables training a high-performance object localizer under coarse point supervision. Experimental results on COCO, DOTA, and our proposed SeaPerson dataset validate the effectiveness of the CPR approach. The dataset and code will be available at https://github.com/ucas-vg/PointTinyBenchmark/.
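The bag-construct / select / average pipeline of CPR can be sketched as follows, with a softmax over a feature-score map standing in for the learned semantic correlation; the neighborhood shape, `radius`, and weighting are assumptions for illustration only.

```python
import numpy as np

def semantic_center(point, feat_map, radius=2):
    """CPR-style sketch: build a bag of grid points around a coarse
    annotation, weight them by a softmax over feature scores (a stand-in
    for semantic correlation), and return their weighted average as
    the refined semantic center point."""
    py, px = point
    h, w = feat_map.shape
    bag, scores = [], []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = py + dy, px + dx
            if 0 <= y < h and 0 <= x < w:
                bag.append((y, x))
                scores.append(feat_map[y, x])
    weights = np.exp(scores - np.max(scores))  # stable softmax
    weights /= weights.sum()
    return tuple(np.average(np.asarray(bag, dtype=float), axis=0, weights=weights))
```

Repeating this refinement as the feature extractor improves gives the "weakly supervised evolution" flavor: centers drift toward semantically consistent locations even when annotators spotted the points freely.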

* accepted by CVPR2022 

Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction

Mar 03, 2021
Hongyao Tang, Jianye Hao, Guangyong Chen, Pengfei Chen, Chen Chen, Yaodong Yang, Luo Zhang, Wulong Liu, Zhaopeng Meng

The value function is the central notion of Reinforcement Learning (RL). Value estimation, especially with function approximation, can be challenging because it involves the stochasticity of environmental dynamics and reward signals that can be sparse and delayed. A typical model-free RL algorithm estimates the values of a policy by Temporal Difference (TD) or Monte Carlo (MC) methods directly from rewards, without explicitly taking dynamics into consideration. In this paper, we propose Value Decomposition with Future Prediction (VDFP), providing an explicit two-step understanding of value estimation: 1) first foresee the latent future, 2) then evaluate it. We analytically decompose the value function into a latent future-dynamics part and a policy-independent trajectory-return part, inducing a way to model latent dynamics and returns separately in value estimation. Further, we derive a practical deep RL algorithm consisting of a convolutional model that learns compact trajectory representations from past experiences, a conditional variational auto-encoder that predicts the latent future dynamics, and a convex return model that evaluates trajectory representations. In experiments, we empirically demonstrate the effectiveness of our approach for both off-policy and on-policy RL in several OpenAI Gym continuous control tasks, as well as a few challenging variants with delayed rewards.
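The two-step decomposition can be reduced to a few lines: sample latent futures from a generative model, score each with a return model, and average, so that V(s) ≈ E_z[g(z)]. The function names and the scalar state are placeholders; the paper's actual models are a CVAE and a convex return network.

```python
import numpy as np

def vdfp_value(state, future_model, return_model, n_samples=32):
    """VDFP-style two-step value estimate sketch:
    1) foresee -- sample latent future representations z ~ p(z | state);
    2) evaluate -- map each z to a return and average: V(s) ≈ E_z[g(z)]."""
    zs = [future_model(state) for _ in range(n_samples)]
    return float(np.mean([return_model(z) for z in zs]))
```

Keeping the return model policy-independent is the key design choice: only the future-prediction model needs updating as the policy changes.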

* Accepted paper on AAAI 2021. arXiv admin note: text overlap with arXiv:1905.11100 

Improving Graph Representation Learning by Contrastive Regularization

Jan 27, 2021
Kaili Ma, Haochen Yang, Han Yang, Tatiana Jin, Pengfei Chen, Yongqiang Chen, Barakeel Fanseu Kamhoua, James Cheng

Graph representation learning is an important task with applications in various areas such as online social networks, e-commerce networks, the WWW, and semantic webs. For unsupervised graph representation learning, many algorithms such as Node2Vec and GraphSAGE make use of "negative sampling" and/or noise contrastive estimation loss. This bears similar ideas to contrastive learning, which "contrasts" the node-representation similarities of semantically similar (positive) pairs against those of negative pairs. However, despite the success of contrastive learning, we find that directly applying this technique to graph representation learning models (e.g., graph convolutional networks) does not always work. We theoretically analyze the generalization performance and propose a light-weight regularization term that avoids large node-representation norms and high variance among them, improving generalization. Our experimental results further validate that this regularization term significantly improves representation quality across different node-similarity definitions and outperforms the state-of-the-art methods.
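The abstract only characterizes the regularizer (penalize large representation norms and high variance among them) without giving its formula; the sketch below is one plausible instantiation of that description, not the paper's exact term.

```python
import numpy as np

def norm_regularizer(embeddings):
    """Light-weight regularization sketch: penalize both the scale of
    node-representation norms and the variance among them. The exact
    combination is illustrative."""
    norms = np.linalg.norm(np.asarray(embeddings, dtype=float), axis=1)
    return float(np.mean(norms ** 2) + np.var(norms))
```

Added to a contrastive objective, such a term discourages a few nodes from dominating similarity scores purely through norm inflation.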


Beyond Class-Conditional Assumption: A Primary Attempt to Combat Instance-Dependent Label Noise

Dec 10, 2020
Pengfei Chen, Junjie Ye, Guangyong Chen, Jingwei Zhao, Pheng-Ann Heng

Supervised learning under label noise has seen numerous advances recently, while existing theoretical findings and empirical results broadly build on the class-conditional noise (CCN) assumption that the noise is independent of input features given the true label. In this work, we present a theoretical hypothesis test and prove that noise in real-world datasets is unlikely to be CCN, which confirms that label noise depends on the instance and justifies the urgent need to go beyond the CCN assumption. These theoretical results motivate us to study the more general and practically relevant instance-dependent noise (IDN). To stimulate the development of theory and methodology on IDN, we formalize an algorithm to generate controllable IDN and present both theoretical and empirical evidence that IDN is semantically meaningful and challenging. As a primary attempt to combat IDN, we present a simple algorithm termed self-evolution average label (SEAL), which not only stands out under IDN with various noise fractions, but also improves generalization on the real-world noise benchmark Clothing1M. Our code is released. Notably, our theoretical analysis in Section 2 provides rigorous motivation for studying IDN, an important topic that deserves more research attention in the future.
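The core of SEAL as described, averaging a sample's predictions over training to form an evolved soft label, fits in one line; the array layout and function name below are illustrative, and the retraining loop around it is omitted.

```python
import numpy as np

def seal_soft_labels(per_epoch_probs):
    """SEAL sketch: average a sample's predicted class distribution over
    training epochs to form a soft label for the next training round.
    `per_epoch_probs` has shape (epochs, n_samples, n_classes)."""
    return np.asarray(per_epoch_probs, dtype=float).mean(axis=0)
```

The intuition is that predictions on an instance fluctuate under IDN, and their running average is a better label estimate than either the noisy label or any single-epoch prediction.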

* Accepted by AAAI 2021 