
Xinghao Ding


Double Normalizing Flows: Flexible Bayesian Gaussian Process ODEs Learning

Sep 17, 2023
Jian Xu, Shian Du, Junmei Yang, Xinghao Ding, John Paisley, Delu Zeng

Recently, Gaussian processes have been utilized to model the vector fields of continuous dynamical systems. Bayesian inference for such models (Hegde et al., 2022) has been extensively studied and applied to tasks such as time series prediction, providing uncertainty estimates. However, previous Gaussian process ordinary differential equation (ODE) models may underperform on datasets whose dynamics do not follow a Gaussian process prior, as their constrained priors and mean-field posteriors lack flexibility. To address this limitation, we incorporate normalizing flows to reparameterize the vector field of the ODE, resulting in a more flexible and expressive prior distribution. Additionally, because normalizing flows have analytically tractable probability density functions, we also apply them to the posterior inference of GP ODEs, yielding a non-Gaussian posterior. Through this dual application of normalizing flows, our model improves both the accuracy and the uncertainty estimates of Bayesian Gaussian process ODEs. The effectiveness of our approach is demonstrated on simulated dynamical systems and real-world human motion data, on tasks including time series prediction and missing data recovery. Experimental results indicate that the proposed method effectively captures model uncertainty while improving accuracy.
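
To make the flow idea concrete, here is a minimal sketch of a single planar normalizing flow, the kind of invertible map that could reparameterize a sampled vector field; this illustrates the general technique only and is not the paper's actual architecture:

```python
import numpy as np

def planar_flow(z, u, w, b):
    """Planar flow f(z) = z + u * tanh(w.z + b), returning the transformed
    sample and log|det J|, the term that keeps densities tractable."""
    a = np.tanh(w @ z + b)                  # scalar pre-activation
    f = z + u * a                           # transformed sample
    psi = (1.0 - a ** 2) * w                # gradient of tanh(w.z + b) w.r.t. z
    log_det = np.log(np.abs(1.0 + u @ psi))
    return f, log_det
```

With `u = 0` the flow reduces to the identity and the log-determinant vanishes, which is a convenient sanity check.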

SRCD: Semantic Reasoning with Compound Domains for Single-Domain Generalized Object Detection

Jul 09, 2023
Zhijie Rao, Jingcai Guo, Luyao Tang, Yue Huang, Xinghao Ding, Song Guo

This paper provides a novel framework for single-domain generalized object detection (i.e., Single-DGOD), in which we are interested in learning and maintaining the semantic structures of self-augmented compound cross-domain samples to enhance the model's generalization ability. Unlike DGOD, which is trained on multiple source domains, Single-DGOD is far more challenging, as it must generalize to multiple target domains from only a single source domain. Existing methods mostly adopt a treatment similar to DGOD, learning domain-invariant features by decoupling or compressing the semantic space. However, there may be two limitations: 1) pseudo attribute-label correlation, due to extremely scarce single-domain data; and 2) the semantic structural information is usually ignored, even though we find that the affinities of instance-level semantic relations within samples are crucial to model generalization. In this paper, we introduce Semantic Reasoning with Compound Domains (SRCD) for Single-DGOD. Specifically, our SRCD contains two main components: the texture-based self-augmentation (TBSA) module and the local-global semantic reasoning (LGSR) module. TBSA aims to eliminate, at the image level, the effects of irrelevant attributes associated with labels, such as light, shadow, and color, via a lightweight yet efficient self-augmentation. LGSR then models the semantic relationships on instance features to uncover and maintain the intrinsic semantic structures. Extensive experiments on multiple benchmarks demonstrate the effectiveness of the proposed SRCD.
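
As a rough sketch of what a texture-based augmentation can look like (the actual TBSA module is not specified here, so this AdaIN-style channel-statistics swap is only an assumed stand-in), one image's per-channel mean and standard deviation are replaced with another's while the content structure is preserved:

```python
import numpy as np

def texture_swap(content, style, eps=1e-6):
    """Swap per-channel color/texture statistics: normalize the content
    image and re-scale it with the style image's mean and std."""
    cm, cs = content.mean(axis=(0, 1)), content.std(axis=(0, 1)) + eps
    sm, ss = style.mean(axis=(0, 1)), style.std(axis=(0, 1)) + eps
    return (content - cm) / cs * ss + sm
```

The output inherits the style image's first- and second-order channel statistics, which is exactly the kind of label-irrelevant attribute (light, color cast) such an augmentation perturbs.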

* 9 pages, 5 figures 

Learning an Interpretable End-to-End Network for Real-Time Acoustic Beamforming

Jun 19, 2023
Hao Liang, Guanxing Zhou, Xiaotong Tu, Andreas Jakobsson, Xinghao Ding, Yue Huang

Recently, many audio industrial applications, such as sound monitoring and source localization, have begun exploiting smart multi-modal devices equipped with a microphone array. Regrettably, model-based methods are often difficult to employ on such devices due to their high computational complexity, as well as the difficulty of appropriately selecting the user-determined parameters. As an alternative, one may use deep network-based methods, but these often generalize poorly and cannot generate the desired beamforming map directly. In this paper, a computationally efficient acoustic beamforming algorithm is proposed, which may be unrolled to form a model-based deep learning network for real-time imaging, here termed the DAMAS-FISTA-Net. By exploiting the natural structure of an acoustic beamformer, the proposed network inherits the physical knowledge of the acoustic system and thus learns the underlying physical properties of the propagation. As a result, all the network parameters may be learned end-to-end, guided by a model-based prior using back-propagation. Notably, the proposed network offers excellent interpretability and the ability to process raw data directly. Extensive numerical experiments on both simulated and real-world data illustrate the preferable performance of the DAMAS-FISTA-Net as compared to alternative approaches.
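
The model-based core that such a network unrolls can be sketched as plain FISTA iterations for a nonnegative least-squares problem; this is a simplified stand-in for the actual DAMAS deconvolution step, not the paper's implementation:

```python
import numpy as np

def fista_nnls(A, b, n_iter=200):
    """FISTA for min_{x >= 0} 0.5 * ||Ax - b||^2 -- the iterative scheme
    an unrolled network like DAMAS-FISTA-Net turns into layers."""
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    y, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - b)               # gradient of the data term
        x_new = np.maximum(y - grad / L, 0.0)  # gradient step + nonnegativity
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = x_new + (t - 1.0) / t_new * (x_new - x)  # momentum extrapolation
        x, t = x_new, t_new
    return x
```

In an unrolled network, quantities such as the step size `1/L` become learnable per-layer parameters trained by back-propagation.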

* 12 pages, 9 figures 

Hint-dynamic Knowledge Distillation

Nov 30, 2022
Yiyang Liu, Chenxin Li, Xiaotong Tu, Xinghao Ding, Yue Huang

Knowledge Distillation (KD) transfers the knowledge from a high-capacity teacher model to promote a smaller student model. Existing efforts guide the distillation by matching prediction logits, feature embeddings, etc., while leaving how to efficiently utilize them in conjunction less explored. In this paper, we propose Hint-dynamic Knowledge Distillation, dubbed HKD, which excavates the knowledge from the teacher's hints in a dynamic scheme. The guidance effect of the knowledge hints usually varies across instances and learning stages, which motivates us to customize a specific hint-learning manner for each instance adaptively. Specifically, a meta-weight network is introduced to generate instance-wise weight coefficients for the knowledge hints, informed by the dynamic learning progress of the student model. We further present a weight ensembling strategy that exploits historical statistics to eliminate the potential bias of the coefficient estimation. Experiments on the standard CIFAR-100 and Tiny-ImageNet benchmarks show that the proposed HKD boosts the effect of knowledge distillation.
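
A minimal numerical sketch of instance-wise weighted distillation follows; the `meta_w` vector stands in for the output of the meta-weight network, which is not reproduced here:

```python
import numpy as np

def softmax(x, T=1.0):
    """Temperature-scaled softmax along the class axis."""
    e = np.exp(x / T - np.max(x / T, axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def weighted_kd_loss(student_logits, teacher_logits, meta_w, T=4.0):
    """Per-instance KL distillation loss, combined with instance-wise
    weights (meta_w is a hypothetical stand-in for the meta-weight net)."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=1)
    return float(np.sum(meta_w * kl) / np.sum(meta_w))
```

When the student matches the teacher exactly, every per-instance KL term is zero regardless of the weights, so the loss vanishes.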

* 5 pages 

Uncertainty Inspired Underwater Image Enhancement

Jul 20, 2022
Zhenqi Fu, Wu Wang, Yue Huang, Xinghao Ding, Kai-Kuang Ma

A main challenge in deep learning-based Underwater Image Enhancement (UIE) is that ground-truth high-quality images are unavailable. Most existing methods first generate approximate reference maps and then train an enhancement network on them as if they were certain, which fails to handle the ambiguity of the reference map. In this paper, we reformulate UIE as a distribution estimation followed by a consensus process. We present a novel probabilistic network to learn the enhancement distribution of degraded underwater images. Specifically, we combine a conditional variational autoencoder with adaptive instance normalization to construct the enhancement distribution. After that, we adopt a consensus process to predict a deterministic result from a set of samples drawn from the distribution. By learning the enhancement distribution, our method can cope, to some extent, with the bias introduced by the reference map labeling, while the consensus process helps produce a robust and stable result. We evaluate the proposed method on two widely used real-world underwater image enhancement datasets. Experimental results demonstrate that our approach enables sampling of possible enhancement predictions, and that the consensus estimate yields competitive performance compared with state-of-the-art UIE methods. Code is available at https://github.com/zhenqifu/PUIE-Net.
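
The consensus step can be sketched as decoding several latent samples and fusing them pixel-wise; the median used here is one simple choice of consensus, assumed purely for illustration:

```python
import numpy as np

def consensus_estimate(decoder, z_samples):
    """Decode a set of latent samples into enhancement predictions and
    fuse them by a pixel-wise median -- one simple consensus rule."""
    preds = np.stack([decoder(z) for z in z_samples], axis=0)
    return np.median(preds, axis=0)
```

The median is robust to an occasional outlier prediction among the samples, which is the practical motivation for fusing rather than picking a single draw.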

Knowledge Condensation Distillation

Jul 12, 2022
Chenxin Li, Mingbao Lin, Zhiyuan Ding, Nie Lin, Yihong Zhuang, Yue Huang, Xinghao Ding, Liujuan Cao

Knowledge Distillation (KD) transfers the knowledge from a high-capacity teacher network to strengthen a smaller student. Existing methods focus on excavating the knowledge hints and transferring the whole of the knowledge to the student. However, knowledge redundancy arises because the knowledge has different value to the student at different learning stages. In this paper, we propose Knowledge Condensation Distillation (KCD). Specifically, the knowledge value of each sample is dynamically estimated, and an Expectation-Maximization (EM) framework is forged to iteratively condense a compact knowledge set from the teacher to guide the student learning. Our approach is easy to build on top of off-the-shelf KD methods, with no extra training parameters and negligible computation overhead. Thus, it presents a new perspective for KD, in which a student that actively identifies the teacher's knowledge in line with its aptitude learns more effectively and efficiently. Experiments on standard benchmarks show that the proposed KCD boosts the performance of the student model with even higher distillation efficiency. Code is available at https://github.com/dzy3/KCD.
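
One condensation step can be sketched as ranking samples by an estimated knowledge value and retaining only the most valuable subset; using the per-sample distillation loss as the value estimate is an assumption made for illustration, not the paper's exact criterion:

```python
import numpy as np

def condense(knowledge_value, keep_frac=0.5):
    """Keep the indices of the top-valued samples -- the compact knowledge
    set that would guide the student in the next EM round."""
    k = max(1, int(len(knowledge_value) * keep_frac))
    return np.argsort(knowledge_value)[::-1][:k]  # indices, highest value first
```

Iterating this selection as the student improves is what makes the retained set adapt to the student's current aptitude.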

* ECCV2022 

Relation Matters: Foreground-aware Graph-based Relational Reasoning for Domain Adaptive Object Detection

Jun 06, 2022
Chaoqi Chen, Jiongcheng Li, Hong-Yu Zhou, Xiaoguang Han, Yue Huang, Xinghao Ding, Yizhou Yu

Domain Adaptive Object Detection (DAOD) focuses on improving the generalization ability of object detectors via knowledge transfer. Recent advances in DAOD strive to shift the emphasis of the adaptation process from global to local by virtue of fine-grained feature alignment methods. However, both global and local alignment approaches fail to capture the topological relations among different foreground objects, as the explicit dependencies and interactions between and within domains are neglected. In this case, only seeking one-vs-one alignment does not necessarily guarantee precise knowledge transfer. Moreover, conventional alignment-based approaches may be vulnerable to catastrophic overfitting on less transferable regions (e.g., backgrounds) due to the accumulation of inaccurate localization results in the target domain. To remedy these issues, we first formulate DAOD as an open-set domain adaptation problem, in which the foregrounds and backgrounds are seen as the ``known classes'' and the ``unknown class'', respectively. Accordingly, we propose a new and general framework for DAOD, named Foreground-aware Graph-based Relational Reasoning (FGRR), which incorporates graph structures into the detection pipeline to explicitly model the intra- and inter-domain foreground object relations in both pixel and semantic spaces, thereby endowing the DAOD model with the capability of relational reasoning beyond the popular alignment-based paradigm. The inter-domain visual and semantic correlations are hierarchically modeled via bipartite graph structures, and the intra-domain relations are encoded via graph attention mechanisms. Empirical results demonstrate that the proposed FGRR exceeds the state-of-the-art performance on four DAOD benchmarks.
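
Cross-domain relational modeling of this kind can be sketched as attention over a bipartite graph, where each source-domain node aggregates target-domain nodes weighted by feature similarity; this is a generic sketch of bipartite attention, not FGRR's exact formulation:

```python
import numpy as np

def bipartite_attention(src, tgt):
    """Each source node attends over all target nodes: scaled dot-product
    similarity, softmax normalization, then weighted aggregation."""
    scores = src @ tgt.T / np.sqrt(src.shape[1])       # (n_src, n_tgt) similarities
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn = e / e.sum(axis=1, keepdims=True)            # rows sum to 1
    return attn @ tgt                                  # aggregated target features
```

If all target nodes carry the same feature vector, every source node receives exactly that vector, since the attention weights form a convex combination.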

* Accepted by IEEE T-PAMI 

A Closer Look at Personalization in Federated Image Classification

Apr 22, 2022
Changxing Jing, Yan Huang, Yihong Zhuang, Liyan Sun, Yue Huang, Zhenlong Xiao, Xinghao Ding

Federated Learning (FL) is developed to learn a single global model across decentralized data, but it struggles to realize client-specific personalization in the presence of statistical heterogeneity. Existing studies focus on learning either a robust global model or personalized classifiers, objectives that diverge because they are inconsistent. This paper shows that it is possible to achieve flexible personalization after the convergence of the global model by introducing representation learning. We first analyze and determine that non-IID data harms representation learning of the global model: existing FL methods adhere to the scheme of jointly learning representations and classifiers, where the global model is an average of classification-based local models that are consistently subject to heterogeneity from non-IID data. As a solution, we separate representation learning from classification learning in FL and propose RepPer, an independent two-stage personalized FL framework. We first learn client-side feature representation models that are robust to non-IID data and aggregate them into a global common representation model. After that, we achieve personalization by learning a classifier head for each client based on the common representation obtained in the former stage. Notably, the proposed two-stage learning scheme of RepPer can potentially be used for lightweight edge computing involving devices with constrained computation power. Experiments on various datasets (CIFAR-10/100, CINIC-10) and heterogeneous data setups show that RepPer outperforms alternatives in flexibility and personalization on non-IID data.
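
Stage one's aggregation can be sketched as a standard data-size-weighted FedAvg over the client representation weights (a sketch of the standard aggregation rule, with the representation models reduced to flat parameter arrays for simplicity):

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Aggregate client parameter arrays into a global model, weighting
    each client by its local dataset size (the FedAvg rule)."""
    sizes = np.asarray(client_sizes, dtype=float)
    coeffs = sizes / sizes.sum()                       # normalized weights
    return sum(c * w for c, w in zip(coeffs, client_weights))
```

In the second stage each client would keep this aggregated representation frozen and fit only a small local classifier head on top of it.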

* 14 pages, 5 figures 

AFSC: Adaptive Fourier Space Compression for Anomaly Detection

Apr 17, 2022
Haote Xu, Yunlong Zhang, Liyan Sun, Chenxin Li, Yue Huang, Xinghao Ding

Anomaly Detection (AD) on medical images enables a model to recognize any type of anomaly pattern without lesion-specific supervised learning. Data-augmentation-based methods construct pseudo-healthy images by "pasting" fake lesions onto real healthy ones, and a network is trained in a supervised manner to predict the healthy image; the lesion can then be found as the difference between the unhealthy input and the pseudo-healthy output. However, manually designed fake lesions alone fail to approximate irregular real lesions, limiting model generalization. We assume that, by exploring the intrinsic data properties within images, we can distinguish previously unseen lesions from healthy regions in an unhealthy image. In this study, we propose an Adaptive Fourier Space Compression (AFSC) module to distill healthy features for AD. Compressing both the magnitude and the phase in the frequency domain addresses the hyperintensity and diverse positions of lesions. Experimental results on the BraTS and MS-SEG datasets demonstrate that an AFSC baseline produces promising detection results and that the AFSC module can be effectively embedded into existing AD methods.
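
A fixed (non-adaptive, non-learned) stand-in for Fourier-space compression can be sketched as keeping only the largest-magnitude coefficients, shrinking them, and inverting the transform; the learned AFSC module itself is not reproduced here:

```python
import numpy as np

def fourier_compress(img, keep_frac=0.1, mag_shrink=0.5):
    """Compress an image in Fourier space: retain the largest-magnitude
    coefficients and shrink their magnitudes, then invert the FFT."""
    F = np.fft.fft2(img)
    mag, phase = np.abs(F), np.angle(F)
    thresh = np.quantile(mag, 1.0 - keep_frac)         # magnitude cutoff
    mask = mag >= thresh                               # keep only large coefficients
    F_c = (mag * mag_shrink) * np.exp(1j * phase) * mask
    return np.real(np.fft.ifft2(F_c))
```

Acting on magnitude and phase separately mirrors the abstract's point that frequency-domain compression can target both lesion hyperintensity (magnitude) and lesion position (phase).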

* 9 pages, 2 figures 