Deyu Meng

Provable Tensor Completion with Graph Information

Oct 04, 2023
Kaidong Wang, Yao Wang, Xiuwu Liao, Shaojie Tang, Can Yang, Deyu Meng

Graphs, depicting the interrelations between variables, have been widely used as effective side information for accurate data recovery in various matrix/tensor recovery related applications. In this paper, we study the tensor completion problem with graph information. Current research on graph-regularized tensor completion tends to be task-specific, lacking generality and systematic approaches. Furthermore, a recovery theory that ensures performance remains absent. Moreover, these approaches overlook the dynamic aspects of graphs, treating them as static akin to matrices, even though graphs can exhibit dynamism in tensor-related scenarios. To confront these challenges, we introduce a pioneering framework that systematically formulates a novel model, theory, and algorithm for solving the dynamic graph regularized tensor completion problem. For the model, we establish a rigorous mathematical representation of the dynamic graph, based on which we derive a new tensor-oriented graph smoothness regularization. By integrating this regularization into a tensor decomposition model based on the transformed t-SVD, we develop a comprehensive model that simultaneously captures the low-rank and similarity structure of the tensor. In terms of theory, we show the alignment between the proposed graph smoothness regularization and a weighted tensor nuclear norm. Subsequently, we establish statistical consistency guarantees for our model, effectively bridging a gap in the theoretical analysis of tensor recovery with graph information. In terms of the algorithm, we develop a highly effective solution with guaranteed convergence for the resulting model. To showcase the strength of our proposed model in contrast to established ones, we provide in-depth numerical experiments on both synthetic data and real-world datasets.
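
As a rough sketch of what a graph-smoothness term can look like inside a completion objective (using a plain static graph Laplacian rather than the paper's dynamic-graph regularization or its transformed t-SVD model; the function names and the weight `lam` are illustrative assumptions):

```python
import numpy as np

def graph_smoothness_penalty(X, L, mode=0):
    """Graph-Laplacian smoothness tr(X_(m)^T L X_(m)) along one tensor mode,
    where X_(m) is the mode-m unfolding; small values mean that slices joined
    by a graph edge have similar entries."""
    Xm = np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)  # mode-m unfolding
    return float(np.sum(Xm * (L @ Xm)))                      # equals tr(Xm^T L Xm)

def completion_loss(X, Y, mask, L, lam=0.1, mode=0):
    """Data fit on observed entries plus the graph smoothness penalty; `lam`
    is a hypothetical trade-off weight."""
    fit = np.sum((mask * (X - Y)) ** 2)
    return fit + lam * graph_smoothness_penalty(X, L, mode)

# Toy usage: a 20 x 15 x 10 tensor with a chain graph over the first mode.
n = 20
A = np.zeros((n, n)); i = np.arange(n - 1)
A[i, i + 1] = A[i + 1, i] = 1.0
L = np.diag(A.sum(axis=1)) - A                               # combinatorial Laplacian
Y = np.random.randn(n, 15, 10)
mask = (np.random.rand(*Y.shape) < 0.3).astype(float)        # 30% observed entries
print(completion_loss(np.zeros_like(Y), Y, mask, L))
```

In a full solver this penalty would be combined with a low-rank term and minimized over the estimate; here only the regularizer's shape is illustrated.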

FRS-Nets: Fourier Parameterized Rotation and Scale Equivariant Networks for Retinal Vessel Segmentation

Sep 27, 2023
Zihong Sun, Qi Xie, Deyu Meng

With translation equivariance, convolutional neural networks (CNNs) have achieved great success in retinal vessel segmentation. However, some other symmetries of the vascular morphology, such as rotation and scale symmetries, are not characterized by CNNs. To embed more equivariance into CNNs and meet the accuracy requirements of retinal vessel segmentation, we construct a novel convolution operator (FRS-Conv), which is Fourier parameterized and equivariant to rotation and scaling. Specifically, we first adopt a new parameterization scheme, which enables convolutional filters to perform arbitrary transformations with high accuracy. Second, we derive the formulations for the rotation and scale equivariant convolution mapping. Finally, we construct FRS-Conv following the proposed formulations and replace the traditional convolution filters in U-Net and Iter-Net with FRS-Conv (FRS-Nets). We faithfully reproduce all compared methods and conduct comprehensive experiments on three public datasets under both in-dataset and cross-dataset settings. With merely 13.9% of the parameters of the corresponding baselines, FRS-Nets achieve state-of-the-art performance and significantly outperform all compared methods, demonstrating their remarkable accuracy, generalization, and clinical application potential.
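
A minimal sketch of the general idea behind Fourier-parameterized filters: one set of coefficients defines a continuous filter, and rotated or rescaled copies come from resampling the same basis on a transformed grid. The frequency set, normalization, and function name below are assumptions, not the paper's actual FRS-Conv construction:

```python
import numpy as np

def fourier_filter(coeffs, size=7, angle=0.0, scale=1.0):
    """Evaluate a filter parameterized by 2D Fourier coefficients on a rotated,
    scaled sampling grid. `coeffs` is a complex (K, K) array holding coefficients
    for frequencies -K//2 .. K//2 along each axis."""
    K = coeffs.shape[0]
    freqs = np.arange(K) - K // 2
    lin = np.linspace(-1.0, 1.0, size)
    yy, xx = np.meshgrid(lin, lin, indexing="ij")
    c, s = np.cos(angle), np.sin(angle)
    u = (c * xx - s * yy) / scale              # rotated, scaled coordinates
    v = (s * xx + c * yy) / scale
    filt = np.zeros((size, size), dtype=complex)
    for a, fu in enumerate(freqs):
        for b, fv in enumerate(freqs):
            filt += coeffs[a, b] * np.exp(1j * np.pi * (fu * u + fv * v))
    return filt.real

coeffs = np.random.randn(5, 5) + 1j * np.random.randn(5, 5)
base = fourier_filter(coeffs)                      # canonical filter
rotated = fourier_filter(coeffs, angle=np.pi / 4)  # same weights, rotated copy
```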

Neural Gradient Regularizer

Sep 13, 2023
Shuang Xu, Yifan Wang, Zixiang Zhao, Jiangjun Peng, Xiangyong Cao, Deyu Meng, Yulun Zhang, Radu Timofte, Luc Van Gool

Owing to its significant success, the prior imposed on gradient maps has consistently been a subject of great interest in the field of image processing. Total variation (TV), one of the most representative regularizers, is known for its ability to capture the intrinsic sparsity prior underlying gradient maps. Nonetheless, TV and its variants often underestimate the gradient maps, leading to the weakening of edges and details whose gradients should not be zero in the original image (i.e., image structures that are not describable by sparse priors on gradient maps). Recently, total deep variation (TDV) has been introduced, which assumes sparsity of feature maps and provides a flexible regularization learned from large-scale datasets for a specific task. However, TDV requires retraining the network when the image or task changes, limiting its versatility. To alleviate this issue, in this paper, we propose a neural gradient regularizer (NGR) that expresses the gradient map as the output of a neural network. Unlike existing methods, NGR does not rely on any subjective sparsity or other prior assumptions on image gradient maps, thereby avoiding their underestimation. NGR is applicable to various image types and different image processing tasks, functioning in a zero-shot learning fashion, which makes it a versatile and plug-and-play regularizer. Extensive experimental results demonstrate the superior performance of NGR over state-of-the-art counterparts for a range of different tasks, further validating its effectiveness and versatility.
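
A hedged sketch contrasting a TV penalty with an NGR-style term in which a network expresses the gradient map; `TinyGradientNet`, the penalty form, and all shapes are illustrative assumptions rather than the paper's actual formulation:

```python
import torch
import torch.nn as nn

def tv_penalty(x):
    """Anisotropic total variation: the l1 norm of finite-difference gradients."""
    dh = (x[..., :, 1:] - x[..., :, :-1]).abs().sum()
    dv = (x[..., 1:, :] - x[..., :-1, :]).abs().sum()
    return dh + dv

class TinyGradientNet(nn.Module):
    """Toy network predicting horizontal/vertical gradient maps from an image;
    a stand-in for the NGR network, whose real architecture is not given here."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 2, 3, padding=1),        # 2 channels: dx and dy
        )

    def forward(self, x):
        return self.body(x)

def ngr_penalty(x, net):
    """Pull the image's finite-difference gradients toward the network-expressed
    gradient map instead of forcing them toward zero (as a sparse prior would)."""
    pred = net(x)
    dh = torch.zeros_like(x); dh[..., :, :-1] = x[..., :, 1:] - x[..., :, :-1]
    dv = torch.zeros_like(x); dv[..., :-1, :] = x[..., 1:, :] - x[..., :-1, :]
    return ((dh - pred[:, :1]) ** 2 + (dv - pred[:, 1:2]) ** 2).sum()

x = torch.rand(1, 1, 64, 64)
penalty = ngr_penalty(x, TinyGradientNet())        # compare with tv_penalty(x)
```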

Cross-Consistent Deep Unfolding Network for Adaptive All-In-One Video Restoration

Sep 07, 2023
Yuanshuo Cheng, Mingwen Shao, Yecong Wan, Lixu Zhang, Wangmeng Zuo, Deyu Meng

Existing Video Restoration (VR) methods typically require deploying an individual model for each type of adverse weather in order to remove its degradation, and thus lack the capability to adaptively process degradations. This limitation amplifies the complexity and deployment costs in practical applications. To overcome this deficiency, in this paper, we propose a Cross-consistent Deep Unfolding Network (CDUN) for All-In-One VR, which for the first time enables a single model to remove diverse degradations. Specifically, the proposed CDUN implements a novel iterative optimization framework capable of restoring frames corrupted by corresponding degradations according to the degradation features given in advance. To empower the framework to eliminate diverse degradations, we devise a Sequence-wise Adaptive Degradation Estimator (SADE) to estimate degradation features for the input corrupted video. By orchestrating these two cascading procedures, CDUN achieves adaptive processing of diverse degradations. In addition, we introduce a window-based inter-frame fusion strategy to utilize information from more adjacent frames. This strategy progressively stacks temporal windows over multiple iterations, effectively enlarging the temporal receptive field and enabling each frame's restoration to leverage information from distant frames. Extensive experiments demonstrate that the proposed method achieves state-of-the-art performance in All-In-One VR.

* 16 pages, 13 figures 
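
A toy sketch of the unfolding idea described above: degradation features are estimated once by a stand-in for SADE and then reused by every unfolded restoration step. Module designs, shapes, and names are assumptions for illustration only:

```python
import torch
import torch.nn as nn

class ToyDegradationEstimator(nn.Module):
    """Stand-in for the sequence-wise adaptive degradation estimator (SADE):
    maps a corrupted video to a per-sequence degradation feature vector."""
    def __init__(self, channels=3, feat_dim=16):
        super().__init__()
        self.conv = nn.Conv3d(channels, feat_dim, kernel_size=3, padding=1)

    def forward(self, video):                          # video: (B, C, T, H, W)
        return self.conv(video).mean(dim=(2, 3, 4))    # (B, feat_dim)

class ToyRestorationStep(nn.Module):
    """One unfolded iteration: refines the current estimate given the input
    video and the estimated degradation feature (illustrative only)."""
    def __init__(self, channels=3, feat_dim=16):
        super().__init__()
        self.fuse = nn.Conv3d(2 * channels + feat_dim, channels, 3, padding=1)

    def forward(self, estimate, video, deg_feat):
        b, _, t, h, w = video.shape
        cond = deg_feat.view(b, -1, 1, 1, 1).expand(-1, -1, t, h, w)
        return estimate + self.fuse(torch.cat([estimate, video, cond], dim=1))

def unfolded_restore(video, estimator, step, n_iters=4):
    """Estimate degradation features once, then run the unfolded iterations."""
    deg_feat = estimator(video)
    estimate = video.clone()
    for _ in range(n_iters):
        estimate = step(estimate, video, deg_feat)
    return estimate

video = torch.randn(1, 3, 8, 64, 64)                  # (batch, channels, frames, H, W)
restored = unfolded_restore(video, ToyDegradationEstimator(), ToyRestorationStep())
```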

CA2: Class-Agnostic Adaptive Feature Adaptation for One-class Classification

Sep 04, 2023
Zilong Zhang, Zhibin Zhao, Deyu Meng, Xingwu Zhang, Xuefeng Chen

One-class classification (OCC), i.e., identifying whether an example belongs to the same distribution as the training data, is essential for deploying machine learning models in the real world. Adapting pre-trained features on the target dataset has proven to be a promising paradigm for improving OCC performance. However, existing methods are constrained by assumptions about the number of classes, which contradicts real scenarios where the number of classes is unknown. In this work, we propose a simple class-agnostic adaptive feature adaptation method (CA2). We generalize the center-based method to unknown classes and optimize this objective based on a prior inherent in the pre-trained network, i.e., pre-trained features belonging to the same class are adjacent. CA2 is validated to consistently improve OCC performance across a spectrum of training data classes, spanning from 1 to 1024, outperforming current state-of-the-art methods. Code is available at https://github.com/zhangzilongc/CA2.

* Submit to AAAI 2024 
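
A hedged sketch of a class-agnostic, center-style adaptation objective consistent with the description above: each feature is pulled toward the mean of its nearest neighbours in pre-trained feature space (exploiting the prior that same-class features are adjacent), and the one-class score at test time is a nearest-neighbour distance. The exact CA2 objective may differ; the names and `k` are illustrative:

```python
import torch
import torch.nn.functional as F

def neighborhood_pull_loss(feats, k=5):
    """Pull each normalized feature toward the mean of its k nearest neighbours,
    a class-agnostic surrogate for a per-class center."""
    z = F.normalize(feats, dim=1)                    # (N, D)
    with torch.no_grad():
        sim = z @ z.t()                              # cosine similarities
        sim.fill_diagonal_(-float("inf"))            # exclude self-matches
        idx = sim.topk(k, dim=1).indices             # (N, k) nearest neighbours
        local_center = z[idx].mean(dim=1)            # (N, D), treated as targets
    return ((z - local_center) ** 2).sum(dim=1).mean()

def occ_score(test_feats, train_feats):
    """One-class score: distance to the closest (adapted) training feature;
    smaller means more likely in-distribution."""
    z = F.normalize(test_feats, dim=1)
    bank = F.normalize(train_feats, dim=1)
    return torch.cdist(z, bank).min(dim=1).values

feats = torch.randn(128, 512, requires_grad=True)    # features of the target data
loss = neighborhood_pull_loss(feats)
```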

CBA: Improving Online Continual Learning via Continual Bias Adaptor

Aug 14, 2023
Quanziang Wang, Renzhen Wang, Yichen Wu, Xixi Jia, Deyu Meng

Online continual learning (CL) aims to learn new knowledge and consolidate previously learned knowledge from non-stationary data streams. Due to the time-varying training setting, a model learned from a changing distribution easily forgets previously learned knowledge and becomes biased toward newly received tasks. To address this problem, we propose a Continual Bias Adaptor (CBA) module that augments the classifier network to adapt to the catastrophic distribution change during training, such that the classifier network is able to learn a stable consolidation of previously learned tasks. In the testing stage, CBA can be removed, so it introduces no additional computation cost or memory overhead. We theoretically reveal why the proposed method can effectively alleviate catastrophic distribution shifts, and empirically demonstrate its effectiveness through extensive experiments based on four rehearsal-based baselines and three public continual learning benchmarks.

* Accepted by ICCV 2023 
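
A minimal sketch of the train/test asymmetry described above: a small bias adaptor sits on top of the classifier logits during online training and is simply dropped at test time. The adaptor's design here is a placeholder, not the paper's actual CBA architecture:

```python
import torch
import torch.nn as nn

class ToyBiasAdaptor(nn.Module):
    """A small module stacked on the classifier during online training to absorb
    the bias toward recently seen tasks (rough stand-in for CBA)."""
    def __init__(self, num_classes):
        super().__init__()
        self.adjust = nn.Linear(num_classes, num_classes)

    def forward(self, logits):
        return logits + self.adjust(logits)     # residual adjustment of the logits

backbone = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))   # toy classifier
adaptor = ToyBiasAdaptor(num_classes=10)

x = torch.randn(8, 3, 32, 32)
train_logits = adaptor(backbone(x))   # training: classifier plus bias adaptor
test_logits = backbone(x)             # testing: the adaptor is simply removed
```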

HIDFlowNet: A Flow-Based Deep Network for Hyperspectral Image Denoising

Jun 20, 2023
Li Pang, Weizhen Gu, Xiangyong Cao, Xiangyu Rui, Jiangjun Peng, Shuang Xu, Gang Yang, Deyu Meng

Hyperspectral image (HSI) denoising is essentially ill-posed, since a noisy HSI can be degraded from multiple clean HSIs. However, current deep learning-based approaches ignore this fact and restore the clean image with a deterministic mapping (i.e., the network receives a noisy HSI and outputs a single clean HSI). To alleviate this issue, this paper proposes a flow-based HSI denoising network (HIDFlowNet) that directly learns the conditional distribution of the clean HSI given the noisy HSI, so that diverse clean HSIs can be sampled from this conditional distribution. Overall, our HIDFlowNet is built on the flow methodology and contains an invertible decoder and a conditional encoder, which fully decouples the learning of low-frequency and high-frequency information of the HSI. Specifically, the invertible decoder is built by stacking a succession of invertible conditional blocks (ICBs) to capture local high-frequency details, since the invertible network is information-lossless. The conditional encoder utilizes down-sampling operations to obtain low-resolution images and uses transformers to capture long-range correlations, so that global low-frequency information can be effectively extracted. Extensive experimental results on simulated and real HSI datasets verify the superiority of our proposed HIDFlowNet over other state-of-the-art methods both quantitatively and visually.

* 10 pages, 8 figures 
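
A toy sketch of the conditional-flow sampling interface implied by the abstract: a single invertible affine coupling step conditioned on encoder features, used to draw several clean candidates from different latents for one noisy input. Channel sizes and module designs are illustrative assumptions, not HIDFlowNet's actual ICBs or encoder:

```python
import torch
import torch.nn as nn

class ToyConditionalCoupling(nn.Module):
    """One invertible affine coupling step conditioned on encoder features;
    a toy stand-in for an invertible conditional block (ICB)."""
    def __init__(self, channels, cond_channels):
        super().__init__()
        self.net = nn.Conv2d(channels // 2 + cond_channels, channels, 3, padding=1)

    def forward(self, x, cond):           # clean image -> latent
        xa, xb = x.chunk(2, dim=1)
        scale, shift = self.net(torch.cat([xa, cond], dim=1)).chunk(2, dim=1)
        return torch.cat([xa, xb * torch.exp(scale) + shift], dim=1)

    def inverse(self, z, cond):           # latent -> clean image sample
        za, zb = z.chunk(2, dim=1)
        scale, shift = self.net(torch.cat([za, cond], dim=1)).chunk(2, dim=1)
        return torch.cat([za, (zb - shift) * torch.exp(-scale)], dim=1)

# Diverse denoised samples: different latents z, same noisy conditioning.
cond_encoder = nn.Conv2d(32, 8, 3, padding=1)        # toy conditional encoder
decoder = ToyConditionalCoupling(channels=32, cond_channels=8)

noisy_hsi = torch.randn(1, 32, 64, 64)               # toy 32-band HSI patch
cond = cond_encoder(noisy_hsi)
samples = [decoder.inverse(torch.randn(1, 32, 64, 64), cond) for _ in range(3)]
```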

A Low-rank Matching Attention based Cross-modal Feature Fusion Method for Conversational Emotion Recognition

Jun 16, 2023
Yuntao Shou, Xiangyong Cao, Deyu Meng, Bo Dong, Qinghua Zheng

Conversational emotion recognition (CER) is an important research topic in human-computer interaction. Although deep learning (DL) based CER approaches have achieved excellent performance, the cross-modal feature fusion methods used in these approaches either ignore the intra-modal and inter-modal emotional interaction or have high computational complexity. To address these issues, this paper develops a novel cross-modal feature fusion method for the CER task, i.e., the low-rank matching attention method (LMAM). By setting a matching weight and calculating attention scores between modal features row by row, LMAM contains fewer parameters than the self-attention method. We further apply low-rank decomposition to the matching weight, reducing the number of parameters in LMAM to less than one-third of that of self-attention. Therefore, LMAM can potentially alleviate the over-fitting issue caused by a large number of parameters. Additionally, by computing and fusing the similarity of intra-modal and inter-modal features, LMAM can simultaneously exploit the intra-modal contextual information within each modality and the complementary semantic information across modalities (i.e., text, video and audio). Experimental results on benchmark datasets show that LMAM can be embedded into existing state-of-the-art DL-based CER methods and boost their performance in a plug-and-play manner. Experimental results also verify the superiority of LMAM over other popular cross-modal fusion methods. Moreover, LMAM is a general cross-modal fusion method and can thus be applied to other multi-modal recognition tasks, e.g., session recommendation and humour detection.

* 10 pages, 4 figures 
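
A hedged sketch of the low-rank matching idea: a d x d matching weight is factorized as U V^T with rank r, cutting parameters from d^2 to 2dr. The module layout below is an illustration under these assumptions, not the exact LMAM formulation:

```python
import torch
import torch.nn as nn

class LowRankMatchingAttention(nn.Module):
    """Cross-modal attention whose matching weight is factorized as U V^T."""
    def __init__(self, dim, rank):
        super().__init__()
        self.U = nn.Parameter(torch.randn(dim, rank) * 0.02)
        self.V = nn.Parameter(torch.randn(dim, rank) * 0.02)

    def forward(self, query_feats, key_feats, value_feats):
        # Row-by-row matching scores between the two modalities' features:
        # (Q U)(K V)^T is equivalent to Q (U V^T) K^T but never forms U V^T.
        scores = (query_feats @ self.U) @ (key_feats @ self.V).transpose(-2, -1)
        attn = scores.softmax(dim=-1)
        return attn @ value_feats

dim, rank = 256, 16
print("full matching weight params:", dim * dim)       # 65536
print("low-rank factor params:     ", 2 * dim * rank)  # 8192

text = torch.randn(2, 20, dim)                         # toy text utterance features
audio = torch.randn(2, 20, dim)                        # toy audio utterance features
fused = LowRankMatchingAttention(dim, rank)(text, audio, audio)
```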

Masked Contrastive Graph Representation Learning for Age Estimation

Jun 16, 2023
Yuntao Shou, Xiangyong Cao, Deyu Meng

Age estimation from face images is a crucial task with various practical applications in areas such as video surveillance and Internet access control. While deep learning-based age estimation frameworks, e.g., convolutional neural networks (CNNs), multi-layer perceptrons (MLPs), and transformers, have shown remarkable performance, they have limitations when modelling complex or irregular objects in an image that contains a large amount of redundant information. To address this issue, this paper exploits the robustness of graph representation learning in dealing with redundant image information and proposes a novel Masked Contrastive Graph Representation Learning (MCGRL) method for age estimation. Specifically, our approach first leverages a CNN to extract semantic features of the image, which are then partitioned into patches that serve as nodes in a graph. Then, we use a masked graph convolutional network (GCN) to derive image-based node representations that capture rich structural information. Finally, we incorporate multiple losses to explore the complementary relationship between structural information and semantic features, which improves the feature representation capability of the GCN. Experimental results on real-world face image datasets demonstrate the superiority of our proposed method over other state-of-the-art age estimation approaches.

* 10 pages, 7 figures 
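
A rough sketch of the pipeline described above under the simplest choices: CNN feature-map locations become graph nodes, a similarity matrix serves as the adjacency, a random fraction of nodes is masked, and one GCN layer aggregates neighbours. All module designs and ratios are illustrative assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def patches_to_graph(feature_map, mask_ratio=0.3):
    """Turn a CNN feature map into graph nodes (one node per spatial location),
    build a similarity-based adjacency, and randomly mask a fraction of nodes."""
    b, c, h, w = feature_map.shape
    nodes = feature_map.flatten(2).transpose(1, 2)        # (B, N, C), N = H*W
    keep = (torch.rand(b, h * w, 1) > mask_ratio).float()
    nodes = nodes * keep                                  # zero out masked nodes
    sim = F.normalize(nodes, dim=-1) @ F.normalize(nodes, dim=-1).transpose(1, 2)
    adj = F.softmax(sim, dim=-1)                          # row-normalized soft adjacency
    return nodes, adj

class ToyGCNLayer(nn.Module):
    """One graph-convolution step: aggregate neighbours, then a linear map."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, nodes, adj):
        return F.relu(self.proj(adj @ nodes))

cnn_feats = torch.randn(2, 64, 14, 14)     # semantic features from a CNN backbone
nodes, adj = patches_to_graph(cnn_feats)
node_repr = ToyGCNLayer(64, 128)(nodes, adj)   # structure-aware node representations
```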