
Qianqian Yang


Semantic-preserved Communication System for Highly Efficient Speech Transmission

May 25, 2022
Tianxiao Han, Qianqian Yang, Zhiguo Shi, Shibo He, Zhaoyang Zhang

Deep learning (DL) based semantic communication methods have been explored for the efficient transmission of images, text, and speech in recent years. In contrast to traditional wireless communication methods, which focus on the transmission of abstract symbols, semantic communication approaches aim to achieve better transmission efficiency by sending only the semantic-related information of the source data. In this paper, we consider semantic-oriented speech transmission, which transmits only the semantic-relevant information over the channel for the speech recognition task, plus a compact additional set of semantic-irrelevant information for the speech reconstruction task. We propose a novel end-to-end DL-based transceiver that extracts and encodes the semantic information from the input speech spectra at the transmitter and outputs the corresponding transcriptions from the decoded semantic information at the receiver. For speech-to-speech transmission, we further include a CTC alignment module that extracts a small amount of additional semantic-irrelevant but speech-related information for better reconstruction of the original speech signals at the receiver. The simulation results confirm that our proposed method outperforms current methods in terms of the accuracy of the predicted text for speech-to-text transmission and the quality of the recovered speech signals for speech-to-speech transmission, and significantly improves transmission efficiency. More specifically, the proposed method transmits only 16% of the symbols required by existing methods while achieving about a 10% reduction in WER for speech-to-text transmission. For speech-to-speech transmission, the improvement in transmission efficiency is even more remarkable, with only 0.2% of the symbols required by the existing method being transmitted.
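The core efficiency idea of the abstract, sending far fewer channel symbols than source features, can be sketched with a toy linear stand-in for the learned DL encoder and decoder. All names, dimensions, and the linear model here are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

n_feat, n_sym = 128, 20  # 20/128 ~ 16%, mirroring the reported symbol ratio
spectrum = rng.standard_normal(n_feat)  # stand-in for one speech spectrum frame

# "Semantic encoder": a fixed random projection standing in for the learned encoder
W = rng.standard_normal((n_sym, n_feat)) / np.sqrt(n_feat)
symbols = W @ spectrum  # only n_sym symbols are sent over the channel

# AWGN channel
received = symbols + 0.01 * rng.standard_normal(n_sym)

# "Semantic decoder": least-squares reconstruction standing in for the learned decoder
recovered = np.linalg.pinv(W) @ received
print(symbols.shape)  # (20,)
```

A learned transceiver would replace the random projection with trained nonlinear networks, but the bandwidth arithmetic is the same: the channel carries 20 symbols instead of 128 features.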

* arXiv admin note: substantial text overlap with arXiv:2202.03211 

OTFPF: Optimal Transport-Based Feature Pyramid Fusion Network for Brain Age Estimation with 3D Overlapped ConvNeXt

May 11, 2022
Yu Fu, Yanyan Huang, Yalin Wang, Shunjie Dong, Le Xue, Xunzhao Yin, Qianqian Yang, Yiyu Shi, Cheng Zhuo

The chronological age of a healthy brain can be predicted from T1-weighted magnetic resonance images (T1 MRIs) using deep neural networks, and the predicted brain age can serve as an effective biomarker for detecting aging-related diseases or disorders. In this paper, we propose an end-to-end neural network architecture, referred to as the optimal transport based feature pyramid fusion (OTFPF) network, for brain age estimation from T1 MRIs. The OTFPF network consists of three types of modules: the Optimal Transport based Feature Pyramid Fusion (OTFPF) module, the 3D overlapped ConvNeXt (3D OL-ConvNeXt) module, and the fusion module. These modules strengthen the network's understanding of each brain's semi-multimodal and multi-level feature pyramid information, and significantly improve its estimation performance. Compared with recent state-of-the-art models, the proposed OTFPF converges faster and performs better. Experiments on 11,728 MRIs spanning ages 3-97 years show that the OTFPF network provides accurate brain age estimation, yielding a mean absolute error (MAE) of 2.097, a Pearson's correlation coefficient (PCC) of 0.993, and a Spearman's rank correlation coefficient (SRCC) of 0.989 between the estimated and chronological ages. Extensive quantitative and ablation experiments demonstrate the superiority and rationality of the OTFPF network. The code and implementation details will be released on GitHub (https://github.com/ZJU-Brain/OTFPF) after the final decision.
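The abstract does not detail the OT-based fusion, but the standard building block for optimal transport between two feature sets is entropic-regularized Sinkhorn matching. The following is a generic sketch of that primitive, not the paper's module; the feature shapes, cost choice, and fusion step are assumptions:

```python
import numpy as np

def sinkhorn(a, b, C, eps=0.1, n_iter=200):
    """Entropic-regularized OT plan between histograms a, b under cost matrix C."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)  # scale columns toward marginal b
        u = a / (K @ v)    # scale rows toward marginal a
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
f1 = rng.standard_normal((5, 8))  # features from one pyramid level (hypothetical)
f2 = rng.standard_normal((6, 8))  # features from another level (hypothetical)
C = ((f1[:, None, :] - f2[None, :, :]) ** 2).sum(-1)
C = C / C.max()  # normalize cost so exp(-C/eps) does not underflow
a = np.full(5, 1 / 5)
b = np.full(6, 1 / 6)
P = sinkhorn(a, b, C)
# Transport-weighted alignment of f2 onto f1's rows
fused = (P / P.sum(1, keepdims=True)) @ f2
```

The plan `P` softly matches the two feature sets; a learned fusion module would typically feed such aligned features into further layers.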

A resource-efficient deep learning framework for low-dose brain PET image reconstruction and analysis

Feb 14, 2022
Yu Fu, Shunjie Dong, Yi Liao, Le Xue, Yuanfan Xu, Feng Li, Qianqian Yang, Tianbai Yu, Mei Tian, Cheng Zhuo

18F-fluorodeoxyglucose (18F-FDG) positron emission tomography (PET) imaging usually requires a full-dose radioactive tracer to obtain satisfactory diagnostic results, which raises concerns about the potential health risks of radiation exposure, especially for pediatric patients. Reconstructing low-dose PET (L-PET) images into high-quality full-dose PET (F-PET) images is an effective way to reduce radiation exposure while preserving diagnostic accuracy. In this paper, we propose a resource-efficient deep learning framework for L-PET reconstruction and analysis, referred to as transGAN-SDAM, which generates F-PET from the corresponding L-PET and quantifies the standard uptake value ratios (SUVRs) of the generated F-PET across the whole brain. The transGAN-SDAM consists of two modules: a transformer-encoded Generative Adversarial Network (transGAN) and a Spatial Deformable Aggregation Module (SDAM). The transGAN generates higher-quality F-PET images, and the SDAM then integrates the spatial information of a sequence of generated F-PET slices to synthesize whole-brain F-PET images. Experimental results demonstrate the superiority and rationality of our approach.
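The SUVR quantification mentioned above follows a standard definition: mean tracer uptake in a target region divided by mean uptake in a reference region (conventionally the cerebellum for brain PET, though the abstract does not name the reference). A minimal sketch on a toy volume:

```python
import numpy as np

def suvr(pet, roi_mask, ref_mask):
    """Standard uptake value ratio: mean ROI uptake / mean reference-region uptake."""
    return pet[roi_mask].mean() / pet[ref_mask].mean()

# Toy volume with an ROI twice as "hot" as the reference region
pet = np.ones((4, 4, 4))
roi = np.zeros_like(pet, dtype=bool)
roi[:2] = True          # hypothetical target region
ref = ~roi              # hypothetical reference region
pet[roi] = 2.0
print(suvr(pet, roi, ref))  # 2.0
```

In practice the masks come from an anatomical atlas registered to the PET volume.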

Wireless Transmission of Images With The Assistance of Multi-level Semantic Information

Feb 08, 2022
Zhenguo Zhang, Qianqian Yang, Shibo He, Mingyang Sun, Jiming Chen

Semantic-oriented communication has been considered a promising approach to boost bandwidth efficiency by transmitting only the semantics of the data. In this paper, we propose a multi-level semantic-aware communication system for wireless image transmission, named MLSC-image, which is based on deep learning techniques and trained in an end-to-end manner. In particular, the proposed model includes a multi-level semantic feature extractor that extracts both high-level semantic information, such as text semantics and segmentation semantics, and low-level semantic information, such as local spatial details of the images. We employ a pretrained image captioning model to capture the text semantics and a pretrained image segmentation model to obtain the segmentation semantics. These high-level and low-level semantic features are then combined and encoded by a joint semantic and channel encoder into symbols transmitted over the physical channel. The numerical results validate the effectiveness and efficiency of the proposed semantic communication system, especially under limited bandwidth conditions, which indicates the advantage of high-level semantics in the compression of images.
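The multi-level combination step described above can be sketched as concatenating the feature vectors from the different extractors before a joint encoder maps them to channel symbols. The dimensions and the linear encoder below are illustrative assumptions, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature vectors from the three extractors
text_sem = rng.standard_normal(16)   # high-level: pretrained captioning model output
seg_sem = rng.standard_normal(32)    # high-level: pretrained segmentation model output
low_level = rng.standard_normal(64)  # low-level: local spatial details

# Multi-level fusion: concatenate all levels, then jointly encode into channel symbols
fused = np.concatenate([text_sem, seg_sem, low_level])
W = rng.standard_normal((24, fused.size)) / np.sqrt(fused.size)  # encoder stand-in
symbols = W @ fused
print(symbols.shape)  # (24,)
```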

Semantic-aware Speech to Text Transmission with Redundancy Removal

Feb 07, 2022
Tianxiao Han, Qianqian Yang, Zhiguo Shi, Shibo He, Zhaoyang Zhang

Deep learning (DL) based semantic communication methods have been explored for the efficient transmission of images, text, and speech in recent years. In contrast to traditional wireless communication methods, which focus on the transmission of abstract symbols, semantic communication approaches attempt to achieve better transmission efficiency by sending only the semantic-related information of the source data. In this paper, we consider semantic-oriented speech-to-text transmission. We propose a novel end-to-end DL-based transceiver, which includes an attention-based soft alignment module and a redundancy removal module to compress the transmitted data. In particular, the former extracts only the text-related semantic features, and the latter further drops the semantically redundant content, greatly reducing the amount of semantic redundancy compared with existing methods. We also propose a two-stage training scheme, which speeds up the training of the proposed DL model. The simulation results indicate that our proposed method outperforms current methods in terms of the accuracy of the received text and transmission efficiency. Moreover, the proposed method has a smaller model size and shorter end-to-end runtime.
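One simple way to picture the redundancy removal step is score-and-threshold pruning: features whose relevance score falls below a cutoff are dropped before channel encoding. This is an illustrative mechanism under assumed names and shapes, not the paper's learned module:

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.standard_normal((10, 4))  # 10 semantic feature vectors (hypothetical)
scores = rng.random(10)                  # stand-in attention/relevance scores

# Redundancy removal: keep only features scoring above a threshold,
# so fewer symbols reach the channel encoder.
keep = scores > 0.5
compressed = features[keep]
print(compressed.shape[0], "of", features.shape[0], "features transmitted")
```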

Blind Channel Estimation for MIMO Systems via Variational Inference

Nov 16, 2021
Jiancheng Tang, Qianqian Yang, Zhaoyang Zhang

In this paper, we investigate the blind channel estimation problem for MIMO systems under Rayleigh fading channels. Conventional MIMO communication techniques require transmitting a considerable number of training symbols as pilots in each data block to obtain the channel state information (CSI) so that the transmitted signals can be successfully recovered. However, as the number of antennas increases, pilot overhead and contamination become a bottleneck for the practical application of MIMO systems. To overcome this obstacle, we propose a blind channel estimation framework in which we introduce an auxiliary posterior distribution of the CSI and the transmitted signals given the received signals, to derive a lower bound on the intractable likelihood function of the received signal. We generate this auxiliary distribution with a neural network based variational inference framework, which is trained by maximizing the lower bound. The optimal auxiliary distribution, which approximates the true posterior distribution, is then leveraged to obtain the maximum a posteriori (MAP) estimates of the channel matrix and the transmitted data. The simulation results demonstrate that the performance of the proposed blind channel estimation method closely approaches that of conventional pilot-aided methods in terms of channel estimation error and symbol error rate (SER) of the detected signals, even without the help of pilots.
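The lower bound described here is the standard evidence lower bound (ELBO) from variational inference. Writing Y for the received signals, H for the channel matrix, and X for the transmitted signals (notation mine, not necessarily the paper's), it reads:

```latex
\log p(\mathbf{Y})
  \;\ge\;
  \mathbb{E}_{q(\mathbf{H},\mathbf{X}\mid\mathbf{Y})}
  \!\left[\log p(\mathbf{Y},\mathbf{H},\mathbf{X})
          - \log q(\mathbf{H},\mathbf{X}\mid\mathbf{Y})\right]
```

Maximizing the right-hand side over the parameters of the neural network that generates q tightens the bound, and the fitted q is then used for the MAP estimation of H and X.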

JMSNAS: Joint Model Split and Neural Architecture Search for Learning over Mobile Edge Networks

Nov 16, 2021
Yuqing Tian, Zhaoyang Zhang, Zhaohui Yang, Qianqian Yang

The main challenge in deploying a deep neural network (DNN) over a mobile edge network is how to split the DNN model so as to match the network architecture as well as the computation and communication capacities of all the nodes. This essentially involves two highly coupled procedures: model generation and model splitting. In this paper, a joint model split and neural architecture search (JMSNAS) framework is proposed to automatically generate and deploy a DNN model over a mobile edge network. Considering both computing and communication resource constraints, a computational graph search problem is formulated to find the multi-split points of the DNN model, and the model is then trained to meet the accuracy requirements. Moreover, the trade-off between model accuracy and completion latency is achieved through proper design of the objective function. The experimental results confirm the superiority of the proposed framework over state-of-the-art split machine learning design methods.
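The abstract does not state the objective function itself; a generic way to write such an accuracy-latency trade-off, over an architecture a and split points s with per-node resource budgets (my notation, purely illustrative), would be:

```latex
\min_{a,\,s}\;
  \mathcal{L}_{\text{task}}(a, s) \;+\; \lambda\, T_{\text{lat}}(a, s)
\quad\text{s.t.}\quad
  C_k(a, s) \le C_k^{\max}, \;\; \forall k,
```

where the weight λ moves the solution along the accuracy-latency frontier and the constraints encode each node's computing and communication capacity.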

* 6 pages, 6 figures, Submitted to IEEE ICC'22 - CRAIN Symposium 

FTPipeHD: A Fault-Tolerant Pipeline-Parallel Distributed Training Framework for Heterogeneous Edge Devices

Oct 06, 2021
Yuhao Chen, Qianqian Yang, Shibo He, Zhiguo Shi, Jiming Chen

With the increased penetration and proliferation of Internet of Things (IoT) devices, there is a growing trend towards distributing the power of deep learning (DL) across edge devices rather than centralizing it in the cloud. This development enables better privacy preservation, real-time responses, and user-specific models. To deploy deep and complex models on edge devices with limited resources, partitioning of the deep neural network (DNN) model is necessary and has been widely studied. However, most of the existing literature only considers distributing the inference model while still relying on centralized cloud infrastructure to generate this model through training. In this paper, we propose FTPipeHD, a novel DNN training framework that trains DNN models across distributed heterogeneous devices with a fault-tolerance mechanism. To accelerate training under the time-varying computing power of each device, we dynamically optimize the partition points according to real-time computing capacities. We also propose a novel weight redistribution approach that periodically replicates the weights to both the neighboring nodes and the central node, which combats the failure of multiple devices during training while incurring limited communication cost. Our numerical results demonstrate that FTPipeHD trains 6.8x faster than the state-of-the-art method when the computing capacity of the best device is 10x greater than that of the worst one. The proposed method is also shown to accelerate training even in the presence of device failures.
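The capacity-aware partitioning idea can be sketched as assigning contiguous blocks of layers so that each device's share of the total compute cost is roughly proportional to its capacity. This greedy sketch is an illustration of the principle, not the paper's exact algorithm:

```python
def partition_layers(layer_costs, capacities):
    """Greedy split: give each device a contiguous block of layers whose total
    cost is roughly proportional to its computing capacity (illustrative only)."""
    total = sum(layer_costs)
    targets = [total * c / sum(capacities) for c in capacities]
    splits, acc, dev = [], 0.0, 0
    for i, cost in enumerate(layer_costs):
        acc += cost
        if dev < len(capacities) - 1 and acc >= targets[dev]:
            splits.append(i + 1)  # split point after layer i
            acc, dev = 0.0, dev + 1
    return splits

# Two devices, the first 10x faster: it should take most of the 11 layers
print(partition_layers([1.0] * 11, [10.0, 1.0]))  # [10]
```

In FTPipeHD's setting the capacities change over time, so a computation like this would be re-run periodically with measured throughputs to move the partition points.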

* 11 pages, 8 figures 

Communication-Efficient Federated Learning with Binary Neural Networks

Oct 05, 2021
Yuzhi Yang, Zhaoyang Zhang, Qianqian Yang

Federated learning (FL) is a privacy-preserving machine learning setting that enables many devices to jointly train a shared global model without revealing their data to a central server. However, FL involves a frequent exchange of parameters between all the clients and the server that coordinates the training. This introduces extensive communication overhead, which can be a major bottleneck in FL with limited communication links. In this paper, we consider training binary neural networks (BNNs) in the FL setting instead of the typical real-valued neural networks, to fulfill the stringent delay and efficiency requirements of wireless edge networks. We introduce a novel FL framework for training BNNs, in which the clients only upload the binary parameters to the server. We also propose a novel parameter updating scheme based on maximum likelihood (ML) estimation that preserves the performance of the BNN even without the aggregated real-valued auxiliary parameters that are usually needed during BNN training. Moreover, for the first time in the literature, we theoretically derive the conditions under which the training of the BNN converges. Numerical results show that the proposed FL framework significantly reduces the communication cost compared with conventional neural networks with typical real-valued parameters, and the performance loss incurred by the binarization can be further compensated by a hybrid method.
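The server-side combination of binary uploads can be pictured with a simple majority-sign rule: each client sends only ±1 parameters, and the server takes the elementwise sign of their sum. This is a deliberately simplified stand-in for the paper's ML-estimation-based updating scheme, whose details are not given in the abstract:

```python
import numpy as np

def aggregate_binary(client_params):
    """Combine clients' binary (+1/-1) parameter vectors by majority sign.
    (Illustrative stand-in for the ML-estimation-based update.)"""
    summed = np.stack(client_params).sum(axis=0)
    return np.where(summed >= 0, 1, -1)

clients = [np.array([1, -1, 1, 1]),
           np.array([1, -1, -1, 1]),
           np.array([-1, -1, 1, 1])]
print(aggregate_binary(clients))  # [ 1 -1  1  1]
```

Either way, the communication saving is the same: each parameter costs one bit per upload instead of a 32-bit float.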

* Accepted for publication in IEEE Journal on Selected Areas in Communications 

Fractional order magnetic resonance fingerprinting in the human cerebral cortex

Jun 09, 2021
Viktor Vegh, Shahrzad Moinian, Qianqian Yang, David C. Reutens

Mathematical models are becoming increasingly important in magnetic resonance imaging (MRI), as they provide a mechanistic approach to linking tissue microstructure with the signals acquired by the imaging instrument. The Bloch equations, which describe spin and relaxation in a magnetic field, are a set of integer-order differential equations whose solution exhibits mono-exponential behaviour in time. Parameters of the model may be estimated using a non-linear solver, or by creating a dictionary of model parameters from which MRI signals are simulated and then matched with experiment. We have previously shown the potential efficacy of a magnetic resonance fingerprinting (MRF) approach, i.e. dictionary matching based on the classical Bloch equations, for parcellating the human cerebral cortex. However, this classical model is unable to fully describe the mm-scale MRI signal generated by a heterogeneous and complex tissue micro-environment. The time-fractional order Bloch equations have been shown to provide a good fit to brain MRI signals as a function of time. We replaced the integer-order Bloch equations with the previously reported time-fractional counterpart within the MRF framework and performed experiments to parcellate human gray matter, which is cortical brain tissue with different cyto-architecture at different spatial locations. Our findings suggest that the time-fractional order parameters, α and β, potentially associate with the effect of interareal architectonic variability, hypothetically leading to more accurate cortical parcellation.
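For context, the classical longitudinal relaxation solution is mono-exponential, M_z(t) = M_0(1 - e^{-t/T_1}). In the time-fractional Bloch literature the exponential is replaced by a Mittag-Leffler function; one common form (notation mine, and the paper's exact formulation may differ) is:

```latex
M_z(t) \;=\; M_0\left(1 - E_\alpha\!\big(-(t/T_1)^{\alpha}\big)\right),
\qquad
E_\alpha(z) \;=\; \sum_{k=0}^{\infty} \frac{z^k}{\Gamma(\alpha k + 1)},
```

where α = 1 recovers the mono-exponential case, and the second order parameter β plays the analogous role for transverse (T2) decay. The slower-than-exponential tails of E_α are what allow the model to capture signal from heterogeneous tissue micro-environments.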
