Sheng Zhao

DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder

Mar 30, 2023
Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian

HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details

Mar 20, 2023
Zenghao Chai, Tianke Zhang, Tianyu He, Xu Tan, Tadas Baltrusaitis, HsiangTao Wu, Runnan Li, Sheng Zhao, Chun Yuan, Jiang Bian

FoundationTTS: Text-to-Speech for ASR Customization with Generative Language Model

Mar 08, 2023
Ruiqing Xue, Yanqing Liu, Lei He, Xu Tan, Linquan Liu, Edward Lin, Sheng Zhao

Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

Mar 07, 2023
Ziqiang Zhang, Long Zhou, Chengyi Wang, Sanyuan Chen, Yu Wu, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei

Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation

Feb 22, 2023
Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Sheng Zhao

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

Jan 05, 2023
Chengyi Wang, Sanyuan Chen, Yu Wu, Ziqiang Zhang, Long Zhou, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei

ResGrad: Residual Denoising Diffusion Probabilistic Models for Text to Speech

Dec 30, 2022
Zehua Chen, Yihan Wu, Yichong Leng, Jiawei Chen, Haohe Liu, Xu Tan, Yang Cui, Ke Wang, Lei He, Sheng Zhao, Jiang Bian, Danilo Mandic

Memories are One-to-Many Mapping Alleviators in Talking Face Generation

Dec 12, 2022
Anni Tang, Tianyu He, Xu Tan, Jun Ling, Runnan Li, Sheng Zhao, Li Song, Jiang Bian

VideoDubber: Machine Translation with Speech-Aware Length Control for Video Dubbing

Nov 30, 2022
Yihan Wu, Junliang Guo, Xu Tan, Chen Zhang, Bohan Li, Ruihua Song, Lei He, Sheng Zhao, Arul Menezes, Jiang Bian
