Xiaorui Wang

Jointly Recognizing Speech and Singing Voices Based on Multi-Task Audio Source Separation

Apr 17, 2024
Ye Bai, Chenxing Li, Hao Li, Yuanyuan Zhao, Xiaorui Wang

Filter Pruning via Filters Similarity in Consecutive Layers

Apr 26, 2023
Xiaorui Wang, Jun Wang, Xin Tang, Peng Gao, Rui Fang, Guotong Xie

Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis

Mar 14, 2023
Chunyu Qiang, Peng Yang, Hao Che, Ying Zhang, Xiaorui Wang, Zhongyuan Wang

Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis

Dec 13, 2022
Chunyu Qiang, Peng Yang, Hao Che, Xiaorui Wang, Zhongyuan Wang

Back-Translation-Style Data Augmentation for Mandarin Chinese Polyphone Disambiguation

Nov 17, 2022
Chunyu Qiang, Peng Yang, Hao Che, Jinba Xiao, Xiaorui Wang, Zhongyuan Wang

Parameter-Efficient Conformers via Sharing Sparsely-Gated Experts for End-to-End Speech Recognition

Sep 17, 2022
Ye Bai, Jie Li, Wenjing Han, Hao Ni, Kaituo Xu, Zhuo Zhang, Cheng Yi, Xiaorui Wang

MELONS: generating melody with long-term structure using transformers and structure graph

Nov 03, 2021
Yi Zou, Pei Zou, Yi Zhao, Kaixiang Zhang, Ran Zhang, Xiaorui Wang

SpeechNAS: Towards Better Trade-off between Latency and Accuracy for Large-Scale Speaker Verification

Sep 18, 2021
Wentao Zhu, Tianlong Kong, Shun Lu, Jixiang Li, Dawei Zhang, Feng Deng, Xiaorui Wang, Sen Yang, Ji Liu

Dynamic Multi-scale Convolution for Dialect Identification

Aug 02, 2021
Tianlong Kong, Shouyi Yin, Dawei Zhang, Wang Geng, Xin Wang, Dandan Song, Jinwen Huang, Huiyu Shi, Xiaorui Wang
