Kaizhi Qian

Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling

Nov 15, 2023
Bairu Hou, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang, Yang Zhang


Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning

Jun 23, 2023
Zhongzhi Yu, Yang Zhang, Kaizhi Qian, Yonggan Fu, Yingyan Lin


Physics-Driven Diffusion Models for Impact Sound Synthesis from Videos

Apr 11, 2023
Kun Su, Kaizhi Qian, Eli Shlizerman, Antonio Torralba, Chuang Gan


Losses Can Be Blessings: Routing Self-Supervised Speech Representations Towards Efficient Multilingual and Multitask Speech Processing

Nov 02, 2022
Yonggan Fu, Yang Zhang, Kaizhi Qian, Zhifan Ye, Zhongzhi Yu, Cheng-I Lai, Yingyan Lin


Improving Self-Supervised Speech Representations by Disentangling Speakers

Apr 20, 2022
Kaizhi Qian, Yang Zhang, Heting Gao, Junrui Ni, Cheng-I Lai, David Cox, Mark Hasegawa-Johnson, Shiyu Chang


WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models

Apr 14, 2022
Heting Gao, Junrui Ni, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson


Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition

Mar 29, 2022
Junrui Ni, Liming Wang, Heting Gao, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson


SpeechSplit 2.0: Unsupervised Speech Disentanglement for Voice Conversion Without Tuning Autoencoder Bottlenecks

Mar 26, 2022
Chak Ho Chan, Kaizhi Qian, Yang Zhang, Mark Hasegawa-Johnson
