Liangliang Cao

Exploiting Category Names for Few-Shot Classification with Vision-Language Models

Dec 04, 2022
Taihong Xiao, Zirui Wang, Liangliang Cao, Jiahui Yu, Shengyang Dai, Ming-Hsuan Yang

SurFit: Learning to Fit Surfaces Improves Few Shot Learning on Point Clouds

Dec 27, 2021
Gopal Sharma, Bidya Dash, Matheus Gadelha, Aruni RoyChowdhury, Marios Loizou, Evangelos Kalogerakis, Liangliang Cao, Erik Learned-Miller, Rui Wang, Subhransu Maji

Input Length Matters: An Empirical Study Of RNN-T And MWER Training For Long-form Telephony Speech Recognition

Oct 08, 2021
Zhiyun Lu, Yanwei Pan, Thibault Doutre, Liangliang Cao, Rohit Prabhavalkar, Chao Zhang, Trevor Strohman

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition

Oct 07, 2021
Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland

BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition

Oct 01, 2021
Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu

Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction

Apr 26, 2021
David Qiu, Yanzhang He, Qiujia Li, Yu Zhang, Liangliang Cao, Ian McGraw

Bridging the gap between streaming and non-streaming ASR systems by distilling ensembles of CTC and RNN-T models

Apr 25, 2021
Thibault Doutre, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Olivier Siohan, Liangliang Cao

Exploring Targeted Universal Adversarial Perturbations to End-to-end ASR Models

Apr 06, 2021
Zhiyun Lu, Wei Han, Yu Zhang, Liangliang Cao

Residual Energy-Based Models for End-to-End Speech Recognition

Mar 25, 2021
Qiujia Li, Yu Zhang, Bo Li, Liangliang Cao, Philip C. Woodland
