Pan Wang

FedEdge AI-TC: A Semi-supervised Traffic Classification Method based on Trusted Federated Deep Learning for Mobile Edge Computing

Aug 14, 2023
Pan Wang, Zeyi Li, Mengyi Fu, Zixuan Wang, Ze Zhang, MinYao Liu

As a typical entity of MEC (Mobile Edge Computing), the 5G CPE (Customer Premise Equipment)/HGU (Home Gateway Unit) has proven to be a promising alternative to the traditional smart home gateway. Network TC (Traffic Classification) is a vital method for service quality assurance and security management in communication networks, and has become a crucial functional entity in 5G CPE/HGU. In recent years, many researchers have applied Machine Learning or Deep Learning (DL) to TC, namely AI-TC, to improve its performance. However, AI-TC faces challenges, including data dependency, resource-intensive traffic labeling, and user privacy concerns. The limited computing resources of 5G CPE further complicate efficient classification. Moreover, the "black box" nature of AI-TC models raises transparency and credibility issues. This paper proposes the FedEdge AI-TC framework, which leverages Federated Learning (FL) for reliable network TC in 5G CPE. FL preserves privacy through local training, iterative model parameter exchange, and centralized aggregation. A semi-supervised TC algorithm based on a Variational Auto-Encoder (VAE) and a convolutional neural network (CNN) reduces data dependency while maintaining accuracy. To enable lightweight model deployment, the paper introduces XAI-Pruning, an AI model compression method combined with DL model interpretability. Experimental evaluation demonstrates FedEdge AI-TC's superiority over benchmarks in both accuracy and TC efficiency. The framework enhances user privacy and model credibility, offering a comprehensive solution for dependable and transparent network TC in 5G CPE, thus improving service quality and security.
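
As a rough PyTorch sketch of the semi-supervised pairing the abstract describes, a VAE can learn representations from unlabeled flows while a small CNN classifies the latent codes of the labeled subset; all layer sizes, names, and preprocessing here are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of VAE + CNN semi-supervised traffic classification.
# Shapes and hyperparameters are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, in_dim=784, latent=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.mu = nn.Linear(256, latent)
        self.logvar = nn.Linear(256, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(),
                                 nn.Linear(256, in_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(x, recon, mu, logvar):
    rec = F.mse_loss(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

class LatentCNN(nn.Module):
    """Classify traffic from VAE latent codes treated as 1-D signals."""
    def __init__(self, latent=32, n_classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool1d(8), nn.Flatten(),
            nn.Linear(16 * 8, n_classes))

    def forward(self, z):
        return self.net(z.unsqueeze(1))

# Unsupervised step on unlabeled flows, supervised step on the labeled subset:
vae, clf = VAE(), LatentCNN()
x_unlab = torch.rand(64, 784)                 # stand-in for preprocessed flows
recon, mu, logvar = vae(x_unlab)
loss_unsup = vae_loss(x_unlab, recon, mu, logvar)
x_lab, y_lab = torch.rand(16, 784), torch.randint(0, 10, (16,))
with torch.no_grad():
    _, mu_lab, _ = vae(x_lab)
loss_sup = F.cross_entropy(clf(mu_lab), y_lab)
```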

* 13 pages, 13 figures 

Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment for Markup-to-Image Generation

Aug 02, 2023
Guojin Zhong, Jin Yuan, Pan Wang, Kailun Yang, Weili Guan, Zhiyong Li

The recently emerging task of markup-to-image generation poses greater challenges than natural image generation, due to its low tolerance for errors and the complex sequence and context correlations between markup and rendered image. This paper proposes a novel model named "Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment" (FSA-CDM), which introduces contrastive positive/negative samples into the diffusion model to boost performance for markup-to-image generation. Technically, we design a fine-grained cross-modal alignment module to thoroughly explore the sequence similarity between the two modalities for learning robust feature representations. To improve generalization ability, we propose a contrast-augmented diffusion model that explicitly explores positive and negative samples by maximizing a novel contrastive variational objective, which is mathematically shown to provide a tighter bound for the model's optimization. Moreover, a context-aware cross-attention module is developed to capture the contextual information within the markup language during the denoising process, yielding better noise prediction. Extensive experiments on four benchmark datasets from different domains demonstrate the effectiveness of the proposed components in FSA-CDM, which exceeds state-of-the-art performance by about 2%-12% in DTW. The code will be released at https://github.com/zgj77/FSACDM.
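
For intuition, a simplified training objective in the spirit of the contrast-augmented idea might pair the usual noise-prediction loss with an InfoNCE-style term over in-batch positives and negatives. FSA-CDM's actual objective is a derived variational bound; this hypothetical sketch does not reproduce it:

```python
# Illustrative contrast-augmented diffusion training loss (simplified).
import torch
import torch.nn.functional as F

def contrastive_diffusion_loss(eps_pred, eps, img_emb, markup_emb, tau=0.1):
    """Noise-prediction loss plus an InfoNCE-style term that treats
    (image i, markup i) as the positive pair and all other in-batch
    pairs as negatives. The real FSA-CDM optimizes a variational bound."""
    denoise = F.mse_loss(eps_pred, eps)
    img = F.normalize(img_emb, dim=-1)
    txt = F.normalize(markup_emb, dim=-1)
    logits = img @ txt.t() / tau           # [B, B] similarity matrix
    targets = torch.arange(len(img))       # positives sit on the diagonal
    contrast = F.cross_entropy(logits, targets)
    return denoise + contrast

B, D = 8, 128
loss = contrastive_diffusion_loss(torch.randn(B, 3, 32, 32),
                                  torch.randn(B, 3, 32, 32),
                                  torch.randn(B, D), torch.randn(B, D))
```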

* Accepted to ACM MM 2023. The code will be released at https://github.com/zgj77/FSACDM 

Multi-scale multi-modal micro-expression recognition algorithm based on transformer

Jan 11, 2023
Fengping Wang, Jie Li, Chun Qi, Lin Wang, Pan Wang

A micro-expression is a spontaneous, unconscious facial muscle movement that can reveal the true emotions people attempt to hide. Although manual methods have made good progress, deep learning is gaining prominence. Because micro-expressions are short in duration and are expressed at different scales across facial regions, existing algorithms cannot extract multi-modal, multi-scale facial region features while also taking contextual information into account when learning the underlying features. To solve these problems, this paper proposes a multi-modal, multi-scale algorithm based on a transformer network, which aims to fully learn local multi-grained features of micro-expressions through two modalities: motion features and texture features. To obtain local facial features at different scales, we learn patch features at several scales for both modalities, fuse multi-layer multi-head attention weights to obtain effective features by weighting the patch features, and incorporate cross-modal contrastive learning for model optimization. We conducted comprehensive experiments on three spontaneous datasets; the results show that the accuracy of the proposed algorithm on the SMIC database under the single-database evaluation reaches 78.73%, and the F1 score on CASME II under the combined-database evaluation reaches 0.9071, both at a leading level.
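
As a loose illustration of the multi-scale patch idea (shapes, module names, and the fusion scheme are assumptions, not the authors' design), one modality branch could embed patches at several scales and pool them with learned attention weights:

```python
# Hypothetical sketch: multi-scale patch embedding for one modality
# (motion or texture), fused by learned per-patch attention weights.
import torch
import torch.nn as nn

class MultiScalePatches(nn.Module):
    def __init__(self, dim=64, scales=(8, 16)):
        super().__init__()
        # One patch-embedding conv per scale; stride equals patch size.
        self.embeds = nn.ModuleList(
            nn.Conv2d(1, dim, kernel_size=s, stride=s) for s in scales)
        self.attn = nn.Linear(dim, 1)   # scalar attention score per patch

    def forward(self, x):                # x: [B, 1, H, W]
        feats = []
        for emb in self.embeds:
            p = emb(x).flatten(2).transpose(1, 2)    # [B, N_s, dim]
            feats.append(p)
        tokens = torch.cat(feats, dim=1)             # patches of all scales
        w = torch.softmax(self.attn(tokens), dim=1)  # weights over patches
        return (w * tokens).sum(dim=1)               # [B, dim]

motion, texture = torch.rand(4, 1, 64, 64), torch.rand(4, 1, 64, 64)
enc_m, enc_t = MultiScalePatches(), MultiScalePatches()
f_m, f_t = enc_m(motion), enc_t(texture)
# Cross-modal contrastive learning would then align f_m with f_t.
```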

Reconstruction of compressed spectral imaging based on global structure and spectral correlation

Oct 27, 2022
Pan Wang, Jie Li, Siqi Zhang, Chun Qi, Lin Wang, Jieru Chen

In this paper, a convolutional sparse coding method based on global structure characteristics and spectral correlation is proposed for the reconstruction of compressive spectral images. The proposed method applies the convolution kernel to the whole image, which better preserves image structure information in the spatial dimension. To fully exploit the constraints between spectra, the coefficients corresponding to the convolution kernel are norm-constrained to improve spectral accuracy. Furthermore, to address the insensitivity of convolutional sparse coding to low frequencies, a global total-variation (TV) constraint is added to estimate the low-frequency components. This not only ensures effective estimation of the low-frequency components but also turns convolutional sparse coding into a denoising process, which simplifies reconstruction. Simulations show that, compared with current mainstream optimization methods (DeSCI and GAP-TV), the proposed method improves reconstruction quality by up to 7 dB in PSNR and 10% in SSIM, with a marked improvement in the details of the reconstructed image.
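
For intuition, here is a toy NumPy sketch of the "reconstruction as denoising" structure in the spirit of GAP-TV, alternating a data-consistency step with a smoothing step. The paper's actual method additionally applies convolutional sparse coding with norm-constrained coefficients, which this sketch omits; mask, sizes, and step sizes are assumptions:

```python
# Toy alternation of data consistency and TV-like smoothing (GAP-TV style).
import numpy as np

def smooth_grad(x):
    """Gradient of a quadratic roughness penalty (a smooth TV surrogate)."""
    dx = np.roll(x, -1, axis=0) - x
    dy = np.roll(x, -1, axis=1) - x
    return -(dx - np.roll(dx, 1, axis=0)) - (dy - np.roll(dy, 1, axis=1))

def reconstruct(y, mask, iters=50, lam=0.1, step=0.5):
    """y: compressed measurement; mask: coded-aperture sensing mask."""
    x = mask * y                                 # crude back-projection init
    for _ in range(iters):
        x = x + step * mask * (y - mask * x)     # data-consistency step
        x = x - step * lam * smooth_grad(x)      # denoising/smoothing step
    return x

rng = np.random.default_rng(0)
truth = rng.random((32, 32))
mask = (rng.random((32, 32)) > 0.5).astype(float)
y = mask * truth
x_hat = reconstruct(y, mask)
```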

A Semantic Consistency Feature Alignment Object Detection Model Based on Mixed-Class Distribution Metrics

Jun 12, 2022
Lijun Gou, Jinrong Yang, Hangcheng Yu, Pan Wang, Xiaoping Li, Chao Deng

Unsupervised domain adaptation is critical for various computer vision tasks, such as object detection and instance segmentation. These methods attempt to reduce domain-bias-induced performance degradation while also promoting model application speed. Previous works on domain-adaptive object detection attempt to align image-level and instance-level shifts to minimize the domain discrepancy, but they may align single-class features to mixed-class features at the image level, because an image in object detection may contain more than one class and object. To achieve single-class-to-single-class and mixed-class-to-mixed-class alignment, we treat a mixed class of features as a new class and propose a mixed-class $H$-divergence for object detection to achieve homogeneous feature alignment and reduce negative transfer. A Semantic Consistency Feature Alignment Model (SCFAM) based on the mixed-class $H$-divergence is then presented. To improve single-class and mixed-class semantic information and accomplish semantic separation, SCFAM proposes Semantic Prediction Models (SPM) and Semantic Bridging Components (SBC). The weight of the pixel-wise domain discriminator loss is then adjusted based on the SPM result to reduce sample imbalance. Extensive unsupervised domain adaptation experiments on widely used datasets illustrate our approach's robust object detection under domain bias.
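
A minimal sketch of one ingredient the abstract names: a pixel-wise domain discriminator trained adversarially via gradient reversal, with a per-pixel weight on its loss. The weighting-by-SPM logic and all shapes are assumptions, not the paper's specification:

```python
# Hypothetical adversarial image-level alignment with a weighted
# pixel-wise domain-discriminator loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, g):
        return -ctx.lam * g, None     # reverse gradients into the backbone

class DomainDiscriminator(nn.Module):
    def __init__(self, c=256):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(c, 64, 1), nn.ReLU(),
                                 nn.Conv2d(64, 1, 1))  # per-pixel logits

    def forward(self, feat, lam=1.0):
        return self.net(GradReverse.apply(feat, lam))

disc = DomainDiscriminator()
feat_src = torch.randn(2, 256, 16, 16)   # backbone features, source domain
logits = disc(feat_src)                  # [2, 1, 16, 16]
dom = torch.zeros_like(logits)           # label 0 = source domain
w = torch.rand_like(logits)              # per-pixel weight (e.g. from SPM)
loss = (w * F.binary_cross_entropy_with_logits(
            logits, dom, reduction="none")).mean()
loss.backward()
```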

Product semantics translation from brain activity via adversarial learning

Mar 29, 2021
Pan Wang, Zhifeng Gong, Shuo Wang, Hao Dong, Jialu Fan, Ling Li, Peter Childs, Yike Guo

A small change in design semantics may affect a user's satisfaction with a product. In this work, we propose a deep generative transformation model that modifies the design semantics of a given product from personalised brain activity via adversarial learning. We attempt to accomplish two goals in this synthesis: 1) synthesising a product image with new features corresponding to the EEG signal; 2) maintaining the other image features that are irrelevant to the EEG signal. Leveraging the idea of StarGAN, the model is designed to synthesise products with preferred design semantics (colour & shape) via adversarial learning from brain activity, and it is applied in a case study to generate shoes with different design semantics from recorded EEG signals. The presented case study serves as a proof of concept that our framework has the potential to synthesise product semantics from brain activity.
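
A hypothetical sketch of the StarGAN-style conditioning the abstract describes, broadcasting an EEG-derived code over the image channels; every name and shape here is illustrative, not the authors' architecture:

```python
# Hypothetical StarGAN-style generator conditioned on an EEG semantic code.
import torch
import torch.nn as nn

class CondGenerator(nn.Module):
    def __init__(self, eeg_dim=32):
        super().__init__()
        # The EEG code is broadcast and concatenated to the image channels,
        # as StarGAN does with its domain label.
        self.net = nn.Sequential(
            nn.Conv2d(3 + eeg_dim, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh())

    def forward(self, img, eeg_code):
        b, _, h, w = img.shape
        cond = eeg_code[:, :, None, None].expand(b, -1, h, w)
        return self.net(torch.cat([img, cond], dim=1))

g = CondGenerator()
shoes = torch.rand(4, 3, 64, 64) * 2 - 1   # images scaled to [-1, 1]
eeg = torch.randn(4, 32)                   # encoded EEG features
edited = g(shoes, eeg)                     # same shoe, new design semantics
# An adversarial loss for realism plus a reconstruction/cycle loss on
# EEG-irrelevant content would complete a StarGAN-style objective.
```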

Implicit Subspace Prior Learning for Dual-Blind Face Restoration

Oct 12, 2020
Lingbo Yang, Pan Wang, Zhanning Gao, Shanshe Wang, Peiran Ren, Siwei Ma, Wen Gao

Face restoration is an inherently ill-posed problem, where additional prior constraints are typically considered crucial for mitigating this pathology. However, real-world image priors are often hard to simulate with precise mathematical models, which inevitably limits the performance and generalization ability of existing prior-regularized restoration methods. In this paper, we study the problem of face restoration under a more practical ``dual-blind'' setting, i.e., without prior assumptions or hand-crafted regularization terms on the degradation profile or image contents. To this end, a novel implicit subspace prior learning (ISPL) framework is proposed as a generic solution to dual-blind face restoration, with two key elements: 1) an implicit formulation to circumvent the ill-defined restoration mapping and 2) a subspace prior decomposition and fusion mechanism to dynamically handle inputs at varying degradation levels with consistently high-quality restoration results. Experimental results demonstrate significant perception-distortion improvements of ISPL over existing state-of-the-art methods on a variety of restoration subtasks, including a 3.69 dB PSNR and 45.8% FID gain against ESRGAN, the 2018 NTIRE SR challenge winner. Overall, we show that it is possible to capture and utilize prior knowledge without explicitly formulating it, which should help inspire new research paradigms for low-level vision tasks.
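
The abstract's "subspace prior decomposition and fusion" could be pictured roughly as parallel prior branches fused by input-dependent gating weights. This is a loose analogy under assumed structure, not ISPL itself:

```python
# Loose "decompose then dynamically fuse" illustration (structure assumed).
import torch
import torch.nn as nn

class SubspaceFusion(nn.Module):
    def __init__(self, c=64, n_subspaces=4):
        super().__init__()
        # Parallel branches stand in for priors over restoration regimes.
        self.branches = nn.ModuleList(
            nn.Conv2d(c, c, 3, padding=1) for _ in range(n_subspaces))
        # A tiny gate predicts per-branch fusion weights from the input,
        # so inputs at different degradation levels lean on different priors.
        self.gate = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(c, n_subspaces), nn.Softmax(-1))

    def forward(self, feat):
        w = self.gate(feat)                                   # [B, K]
        outs = torch.stack([b(feat) for b in self.branches], dim=1)
        return (w[:, :, None, None, None] * outs).sum(dim=1)  # fused feature

fuse = SubspaceFusion()
restored = fuse(torch.randn(2, 64, 32, 32))                   # [2, 64, 32, 32]
```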

* TPAMI submission 

Towards Fine-grained Human Pose Transfer with Detail Replenishing Network

May 26, 2020
Lingbo Yang, Pan Wang, Chang Liu, Zhanning Gao, Peiran Ren, Xinfeng Zhang, Shanshe Wang, Siwei Ma, Xiansheng Hua, Wen Gao

Human pose transfer (HPT) is an emerging research topic with huge potential in fashion design, media production, online advertising and virtual reality. For these applications, the visual realism of fine-grained appearance details is crucial for production quality and user engagement. However, existing HPT methods often suffer from three fundamental issues: detail deficiency, content ambiguity and style inconsistency, which severely degrade the visual quality and realism of generated images. Aiming at real-world applications, we develop a more challenging yet practical HPT setting, termed Fine-grained Human Pose Transfer (FHPT), with a stronger focus on semantic fidelity and detail replenishment. Concretely, we analyze the potential design flaws of existing methods via an illustrative example, and establish the core FHPT methodology by combining the ideas of content synthesis and feature transfer in a mutually guided fashion. Thereafter, we substantiate the proposed methodology with a Detail Replenishing Network (DRN) and a corresponding coarse-to-fine model training scheme. Moreover, we build a complete suite of fine-grained evaluation protocols to address the challenges of FHPT in a comprehensive manner, covering semantic analysis, structural detection and perceptual quality assessment. Extensive experiments on the DeepFashion benchmark dataset have verified the power of the proposed method against state-of-the-art works, with a 12%-14% gain in top-10 retrieval recall, 5% higher joint localization accuracy, and a nearly 40% gain in face identity preservation. Moreover, the evaluation results offer further insight into the subject matter, which could inspire many promising future works along this direction.
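
The "mutually guided" combination of content synthesis and feature transfer might be pictured as two feature streams, each conditioned on the other's current state at every block. This is an assumed structure for illustration only, not the paper's DRN:

```python
# Rough sketch of mutually guided synthesis and transfer streams.
import torch
import torch.nn as nn

class MutualBlock(nn.Module):
    def __init__(self, c=64):
        super().__init__()
        self.synth = nn.Conv2d(2 * c, c, 3, padding=1)     # content synthesis
        self.transfer = nn.Conv2d(2 * c, c, 3, padding=1)  # feature transfer

    def forward(self, f_synth, f_trans):
        # Each stream sees the other stream's features as guidance.
        new_synth = torch.relu(self.synth(torch.cat([f_synth, f_trans], 1)))
        new_trans = torch.relu(self.transfer(torch.cat([f_trans, f_synth], 1)))
        return new_synth, new_trans

blk = MutualBlock()
fs, ft = torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32)
fs, ft = blk(fs, ft)   # one round of mutual guidance
```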

* IEEE TIP submission 

Region-adaptive Texture Enhancement for Detailed Person Image Synthesis

May 26, 2020
Lingbo Yang, Pan Wang, Xinfeng Zhang, Shanshe Wang, Zhanning Gao, Peiran Ren, Xuansong Xie, Siwei Ma, Wen Gao

The ability to produce convincing textural details is essential for the fidelity of synthesized person images. However, existing methods typically follow a ``warping-based'' strategy that propagates appearance features through the same pathway used for pose transfer, so most fine-grained features are lost to down-sampling, leading to over-smoothed clothes and missing details in the output images. In this paper we present RATE-Net, a novel framework for synthesizing person images with sharp texture details. The proposed framework leverages an additional texture-enhancing module to extract appearance information from the source image and estimate a fine-grained residual texture map, which helps refine the coarse estimation from the pose transfer module. In addition, we design an effective alternate updating strategy to promote mutual guidance between the two modules for better shape and appearance consistency. Experiments on the DeepFashion benchmark dataset demonstrate the superiority of our framework over existing networks.
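
A minimal sketch of the residual-texture idea (module names and shapes assumed): a texture branch predicts a fine-grained residual that is added to the coarse pose-transfer output:

```python
# Hypothetical residual texture refinement of a coarse pose-transfer output.
import torch
import torch.nn as nn

class TextureEnhancer(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1))

    def forward(self, source_img):
        return self.net(source_img)     # fine-grained residual texture map

coarse = torch.rand(1, 3, 128, 128)     # coarse pose-transfer estimate
source = torch.rand(1, 3, 128, 128)     # source appearance image
refined = coarse + TextureEnhancer()(source)
```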

* Accepted in ICME 2020 oral, Recommended for TMM journal 