Lei He

DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders

Jul 11, 2022
Yanqing Liu, Ruiqing Xue, Lei He, Xu Tan, Sheng Zhao


ReMix: A General and Efficient Framework for Multiple Instance Learning based Whole Slide Image Classification

Jul 05, 2022
Jiawei Yang, Hanbo Chen, Yu Zhao, Fan Yang, Yao Zhang, Lei He, Jianhua Yao


Self-supervised Context-aware Style Representation for Expressive Speech Synthesis

Jun 25, 2022
Yihan Wu, Xi Wang, Shaofei Zhang, Lei He, Ruihua Song, Jian-Yun Nie


BinauralGrad: A Two-Stage Conditional Diffusion Probabilistic Model for Binaural Audio Synthesis

May 30, 2022
Yichong Leng, Zehua Chen, Junliang Guo, Haohe Liu, Jiawei Chen, Xu Tan, Danilo Mandic, Lei He, Xiang-Yang Li, Tao Qin, Sheng Zhao, Tie-Yan Liu


NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

May 10, 2022
Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu


AdaSpeech 4: Adaptive Text to Speech in Zero-Shot Scenarios

Apr 01, 2022
Yihan Wu, Xu Tan, Bohan Li, Lei He, Sheng Zhao, Ruihua Song, Tao Qin, Tie-Yan Liu


InferGrad: Improving Diffusion Models for Vocoder by Considering Inference in Training

Feb 08, 2022
Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo Mandic, Lei He, Sheng Zhao


Cross-Lingual Text-to-Speech Using Multi-Task Learning and Speaker Classifier Joint Training

Jan 20, 2022
J. Yang, Lei He


DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021

Nov 19, 2021
Yanqing Liu, Zhihang Xu, Gang Wang, Kuan Chen, Bohan Li, Xu Tan, Jinzhu Li, Lei He, Sheng Zhao


LF-YOLO: A Lighter and Faster YOLO for Weld Defect Detection of X-ray Image

Nov 18, 2021
Moyun Liu, Youping Chen, Lei He, Yang Zhang, Jingming Xie
