
Haoyu Li

Hybrid HMM Decoder For Convolutional Codes By Joint Trellis-Like Structure and Channel Prior

Nov 12, 2022
Haoyu Li, Xuan Wang, Tong Liu, Dingyi Fang, Baoying Liu

The anti-interference capability of wireless links is a physical-layer problem for edge computing. Although convolutional codes have inherent error-correction potential due to the redundancy introduced into the data, their performance degrades drastically under multipath effects on the channel. In this paper, we propose reconstructing convolutional codes with a Hidden Markov Model (HMM) and decoding with the Viterbi algorithm. Furthermore, to implement soft-decision decoding, the observation model of the HMM is replaced by a Gaussian mixture model (GMM). Our method offers greater error-correction potential than the standard method because the model parameters contain channel state information (CSI). We evaluated the method against standard Viterbi decoding by numerical simulation. In the multipath channel, the hybrid HMM decoder achieves performance gains of 4.7 dB and 2 dB with hard-decision and soft-decision decoding, respectively. The HMM decoder also achieves significant performance gains for the RSC code, suggesting that the method could be extended to turbo codes.

* IEEE Transactions on Cognitive Communications and Networking, Early-Access, 2022  
* 12 pages, 8 figures 
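
The trellis decoding that the hybrid model builds on can be sketched compactly. Below is a minimal hard-decision Viterbi decoder for a rate-1/2, constraint-length-3 convolutional code with generator polynomials (7, 5) in octal; the code parameters are illustrative choices, and this is the standard baseline decoder, not the HMM/GMM hybrid itself.

```python
# Rate-1/2, constraint-length-3 convolutional code, generators (7, 5) octal.
G = [0b111, 0b101]
K = 3
N_STATES = 1 << (K - 1)

def encode(bits):
    """Encode a bit list; two output bits per input bit."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state            # input bit + K-1 memory bits
        out.extend(bin(reg & g).count("1") & 1 for g in G)
        state = reg >> 1                         # shift register forward
    return out

def viterbi_decode(received, n_bits):
    """Hard-decision Viterbi decoding over the code trellis.

    Stores full survivor paths for simplicity instead of a traceback table.
    """
    INF = float("inf")
    metric = [0.0] + [INF] * (N_STATES - 1)      # start in the all-zero state
    paths = [[] for _ in range(N_STATES)]
    for t in range(n_bits):
        r = received[2 * t:2 * t + 2]
        new_metric = [INF] * N_STATES
        new_paths = [None] * N_STATES
        for state in range(N_STATES):
            if metric[state] == INF:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expect = [bin(reg & g).count("1") & 1 for g in G]
                dist = sum(x != y for x, y in zip(expect, r))  # Hamming metric
                nxt = reg >> 1
                cand = metric[state] + dist
                if cand < new_metric[nxt]:       # keep the survivor path
                    new_metric[nxt] = cand
                    new_paths[nxt] = paths[state] + [b]
        metric, paths = new_metric, new_paths
    best = min(range(N_STATES), key=lambda s: metric[s])
    return paths[best]

msg = [1, 0, 1, 1]
rx = encode(msg)                 # -> [1, 1, 1, 0, 0, 0, 0, 1]
decoded = viterbi_decode(rx, len(msg))
```

Replacing the Hamming branch metric with a probabilistic metric whose parameters reflect channel observations is, at a high level, where the HMM/GMM reformulation departs from this baseline.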

IDLat: An Importance-Driven Latent Generation Method for Scientific Data

Aug 05, 2022
Jingyi Shen, Haoyu Li, Jiayi Xu, Ayan Biswas, Han-Wei Shen

Deep-learning-based latent representations have been widely used for numerous scientific visualization applications such as isosurface similarity analysis, volume rendering, flow field synthesis, and data reduction, to name a few. However, existing latent representations are mostly generated from raw data in an unsupervised manner, which makes it difficult to incorporate domain interest to control the size of the latent representations and the quality of the reconstructed data. In this paper, we present a novel importance-driven latent representation to facilitate domain-interest-guided scientific data visualization and analysis. We utilize spatial importance maps to represent various scientific interests and feed them to a feature transformation network to guide latent generation. We further reduce the latent size with a lossless entropy-encoding algorithm trained together with the autoencoder, improving storage and memory efficiency. We qualitatively and quantitatively evaluate the effectiveness and efficiency of the latent representations generated by our method on data from multiple scientific visualization applications.

* 11 pages, 12 figures, Proc. IEEE VIS 2022 
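
The role of the importance map can be made concrete with a toy computation. The sketch below weights reconstruction error by a spatial importance map, so errors in high-interest regions dominate; note that in the paper the map feeds a feature transformation network that guides latent generation rather than weighting the loss directly, so this form is an illustrative simplification with synthetic data.

```python
import numpy as np

def weighted_recon_loss(data, recon, importance):
    """Importance-weighted mean squared error (weights normalized to sum to 1)."""
    w = importance / importance.sum()
    return np.sum(w * (data - recon) ** 2)

data = np.ones((8, 8))                 # synthetic scalar field
importance = np.ones((8, 8))
importance[:4, :] = 10.0               # top half is the region of domain interest

recon_good_top = np.zeros((8, 8))
recon_good_top[:4, :] = 1.0            # reconstructs the important region well

recon_good_bottom = np.zeros((8, 8))
recon_good_bottom[4:, :] = 1.0         # reconstructs only the unimportant region

loss_focus = weighted_recon_loss(data, recon_good_top, importance)
loss_elsewhere = weighted_recon_loss(data, recon_good_bottom, importance)
# loss_focus is much smaller: the map steers fidelity toward the region of interest.
```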

VDL-Surrogate: A View-Dependent Latent-based Model for Parameter Space Exploration of Ensemble Simulations

Jul 29, 2022
Neng Shi, Jiayi Xu, Haoyu Li, Hanqi Guo, Jonathan Woodring, Han-Wei Shen

We propose VDL-Surrogate, a view-dependent neural-network-latent-based surrogate model for parameter space exploration of ensemble simulations that allows high-resolution visualizations and user-specified visual mappings. Surrogate-enabled parameter space exploration allows domain scientists to preview simulation results without having to run a large number of computationally costly simulations. Limited by computational resources, however, existing surrogate models may not produce previews with sufficient resolution for visualization and analysis. To improve the efficient use of computational resources and support high-resolution exploration, we perform ray casting from different viewpoints to collect samples and produce compact latent representations. This latent encoding process reduces the cost of surrogate model training while maintaining the output quality. In the model training stage, we select viewpoints to cover the whole viewing sphere and train corresponding VDL-Surrogate models for the selected viewpoints. In the model inference stage, we predict the latent representations at the previously selected viewpoints and decode them to data space. For any given viewpoint, we interpolate the decoded data from the selected viewpoints and generate visualizations with user-specified visual mappings. We show the effectiveness and efficiency of VDL-Surrogate on cosmological and ocean simulations with quantitative and qualitative evaluations. Source code is publicly available at https://github.com/trainsn/VDL-Surrogate.

* Accepted by IEEE Transactions on Visualization and Computer Graphics (Proc. IEEE VIS 2022) 
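
The final step, interpolating decoded data across selected viewpoints, can be sketched as follows. The inverse-angular-distance blending used here is an assumed, simplified scheme (the abstract does not specify the exact formula), and the viewpoints and "decoded data" are synthetic stand-ins for the decoder outputs.

```python
import numpy as np

def interpolate_view(query, viewpoints, decoded, eps=1e-8):
    """Blend decoded outputs of selected viewpoints by inverse angular distance.

    `viewpoints` are unit vectors on the viewing sphere; `decoded` holds one
    flattened decoded output per viewpoint.
    """
    cos = np.clip(viewpoints @ query, -1.0, 1.0)
    ang = np.arccos(cos)                  # angular distance to each viewpoint
    w = 1.0 / (ang + eps)                 # closer viewpoints dominate
    w = w / w.sum()
    return np.tensordot(w, decoded, axes=1)

viewpoints = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0]])  # two selected viewpoints
decoded = np.array([[0.0, 0.0],
                    [10.0, 10.0]])        # placeholder decoded data per viewpoint
query = np.array([1.0, 0.0, 0.0])         # coincides with the first viewpoint

out = interpolate_view(query, viewpoints, decoded)
# When the query matches a selected viewpoint, its decoded data dominates.
```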

Joint Noise Reduction and Listening Enhancement for Full-End Speech Enhancement

Mar 22, 2022
Haoyu Li, Yun Liu, Junichi Yamagishi

Speech enhancement (SE) methods mainly focus on recovering clean speech from noisy input. In real-world speech communication, however, noise often exists not only in the speaker's environment but also in the listener's. Although SE methods can suppress the noise contained in the speaker's voice, they cannot deal with the noise physically present on the listener's side. To address this complicated but common scenario, we investigate a deep-learning-based joint framework integrating noise reduction (NR) with listening enhancement (LE), in which the NR module first suppresses noise and the LE module then modifies the denoised speech, i.e., the output of the NR module, to further improve speech intelligibility. The enhanced speech can thus be less noisy and more intelligible for listeners. Experimental results show that our proposed method achieves promising results and significantly outperforms disjoint processing methods in terms of various speech evaluation metrics.

* Submitted to Interspeech 2022 

DDS: A new device-degraded speech dataset for speech enhancement

Sep 28, 2021
Haoyu Li, Junichi Yamagishi

A large and growing amount of speech content in real-life scenarios is being recorded on common consumer devices in uncontrolled environments, resulting in degraded speech quality. Transforming such low-quality device-degraded speech into high-quality speech is a goal of speech enhancement (SE). This paper introduces a new speech dataset, DDS, to facilitate research on SE. DDS provides aligned parallel recordings of high-quality speech (recorded in professional studios) and a number of low-quality versions, totaling approximately 2,000 hours of speech data. The DDS dataset covers 27 realistic recording conditions by combining diverse acoustic environments and microphone devices, and each condition includes recordings from six different microphone positions to simulate various signal-to-noise ratios (SNRs) and reverberation levels. We also test several SE baseline systems on the DDS dataset and show the impact of recording diversity on performance.

* Submitted to ICASSP 2022 

Time Varying Particle Data Feature Extraction and Tracking with Neural Networks

May 27, 2021
Haoyu Li, Han-Wei Shen

Analyzing particle data plays an important role in many scientific applications such as fluid simulation, cosmology simulation, and molecular dynamics. While methods exist for feature extraction and tracking in volumetric data, performing those tasks on particle data is more challenging because of the lack of explicit connectivity information. Although one may first convert the particle data to a volume, this conversion risks introducing error and increasing the size of the data. In this paper, we take a deep learning approach to create feature representations for scientific particle data to assist feature extraction and tracking. We employ a deep learning model that produces latent vectors representing the relation between spatial locations and physical attributes in a local neighborhood. Features can then be extracted by clustering these latent vectors. To achieve fast feature tracking, the mean-shift tracking algorithm is applied in the feature space, which only requires inference of the latent vectors for selected regions of interest. We validate our approach on two datasets and compare our method with other existing methods.
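
The tracking idea, running mean-shift in the learned feature space, can be sketched with synthetic stand-ins for the latent vectors. The flat-kernel mean-shift below and the 2-D latent dimension are illustrative simplifications, not the paper's exact setup.

```python
import numpy as np

def mean_shift(points, start, bandwidth=1.0, n_iter=50, tol=1e-6):
    """Shift `start` toward the local density mode of `points` (flat kernel)."""
    centre = start.astype(float)
    for _ in range(n_iter):
        dist = np.linalg.norm(points - centre, axis=1)
        window = points[dist < bandwidth]     # neighbors inside the kernel window
        if len(window) == 0:
            break
        new_centre = window.mean(axis=0)
        if np.linalg.norm(new_centre - centre) < tol:
            break
        centre = new_centre
    return centre

rng = np.random.default_rng(0)
# "Latent vectors" of one feature at time t: a cluster near (0, 0).
latent_t0 = rng.normal(loc=[0.0, 0.0], scale=0.2, size=(200, 2))
# At time t+1 the feature has drifted toward (0.5, 0.3).
latent_t1 = rng.normal(loc=[0.5, 0.3], scale=0.2, size=(200, 2))

# Track: start from the previous cluster centre, shift to the new mode.
centre = mean_shift(latent_t1, start=latent_t0.mean(axis=0), bandwidth=1.0)
```

Because mean-shift only evaluates points inside the window, tracking needs latent vectors only around the previous feature location, which is what keeps the tracking step cheap.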


Multi-Metric Optimization using Generative Adversarial Networks for Near-End Speech Intelligibility Enhancement

Apr 17, 2021
Haoyu Li, Junichi Yamagishi

The intelligibility of speech severely degrades in the presence of environmental noise and reverberation. In this paper, we propose a novel deep-learning-based system for modifying the speech signal to increase its intelligibility under the equal-power constraint, i.e., signal power before and after modification must be the same. To achieve this, we use generative adversarial networks (GANs) to obtain time-frequency-dependent amplification factors, which are then applied to the input raw speech to reallocate the speech energy. Instead of optimizing only a single, simple metric, we train a deep neural network (DNN) model to simultaneously optimize multiple advanced speech metrics, including both intelligibility- and quality-related ones, which results in notable improvements in performance and robustness. Our system can not only work in non-real-time mode for offline audio playback but also support practical real-time speech applications. Experimental results from both objective measurements and subjective listening tests indicate that the proposed system significantly outperforms state-of-the-art baseline systems under various noisy and reverberant listening conditions.

* Submitted to IEEE/ACM Transactions on Audio Speech and Language Processing 
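
The equal-power constraint is easy to make concrete: whatever amplification the model produces, the modified signal is rescaled so its power exactly matches the input's. In the sketch below, the per-sample gain curve is a placeholder standing in for the GAN's learned time-frequency amplification factors.

```python
import numpy as np

def apply_equal_power(speech, gains):
    """Apply per-sample gains, then rescale to the original signal power."""
    modified = speech * gains
    p_in = np.mean(speech ** 2)
    p_out = np.mean(modified ** 2)
    return modified * np.sqrt(p_in / p_out)   # enforce equal power

rng = np.random.default_rng(1)
speech = rng.standard_normal(16000)           # 1 s stand-in signal at 16 kHz
# Hypothetical gain curve: boosts some regions, attenuates others.
gains = 1.0 + 0.5 * np.sin(np.linspace(0.0, 20.0, 16000))

out = apply_equal_power(speech, gains)
# Energy is reallocated across time, but total power is unchanged.
```

The rescaling is what makes the modification a pure energy reallocation: the system cannot "cheat" by simply making the speech louder.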

NNVA: Neural Network Assisted Visual Analysis of Yeast Cell Polarization Simulation

Apr 19, 2019
Subhashis Hazarika, Haoyu Li, Ko-Chih Wang, Han-Wei Shen, Ching-Shan Chou

Complex computational models are often designed to simulate real-world physical phenomena in many scientific disciplines. However, these simulation models tend to be computationally very expensive and involve a large number of simulation input parameters, which need to be analyzed and properly calibrated before the models can be applied in real scientific studies. We propose a visual analysis system to facilitate interactive exploratory analysis of the high-dimensional input parameter space of a complex yeast cell polarization simulation. The proposed system assists the computational biologists who designed the simulation model in visually calibrating the input parameters: they can modify parameter values and immediately visualize the predicted simulation outcome without needing to run the original expensive simulation for every instance. Our visual analysis system is driven by a trained neural-network-based surrogate model as the backend analysis framework. Surrogate models are widely used in the simulation sciences to efficiently analyze computationally expensive simulation models. In this work, we demonstrate the advantage of using neural networks as surrogate models for visual analysis by incorporating recent advances in uncertainty quantification, interpretability, and explainability of neural-network-based models. We use the trained network to perform interactive parameter sensitivity analysis of the original simulation at multiple levels of detail and to recommend optimal parameter configurations using the activation maximization framework of neural networks. We also facilitate detailed analysis of the trained network to extract useful insights about the simulation model that the network learned during training.

* Under Review 
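
The parameter-recommendation step can be illustrated schematically: activation maximization gradient-ascends the surrogate's input until a chosen output is maximized. The quadratic "surrogate" below, with a hand-written gradient, is a stand-in for the trained network; a real system would use automatic differentiation through the network.

```python
import numpy as np

def surrogate(x):
    """Toy differentiable surrogate: a 'simulation quality' score peaking at x = 3."""
    return -np.sum((x - 3.0) ** 2)

def surrogate_grad(x):
    """Analytic gradient of the toy surrogate with respect to its input."""
    return -2.0 * (x - 3.0)

x = np.zeros(4)                           # initial guess for 4 input parameters
for _ in range(200):
    x = x + 0.05 * surrogate_grad(x)      # gradient ascent on the input

score = surrogate(x)
# x converges to the configuration that maximizes the surrogate's output.
```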