"speech": models, code, and papers

NLPositionality: Characterizing Design Biases of Datasets and Models

Jun 02, 2023
Sebastin Santy, Jenny T. Liang, Ronan Le Bras, Katharina Reinecke, Maarten Sap

Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

Mar 07, 2023
Ziqiang Zhang, Long Zhou, Chengyi Wang, Sanyuan Chen, Yu Wu, Shujie Liu, Zhuo Chen, Yanqing Liu, Huaming Wang, Jinyu Li, Lei He, Sheng Zhao, Furu Wei

Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech

Aug 10, 2022
Kaitao Song, Teng Wan, Bixia Wang, Huiqiang Jiang, Luna Qiu, Jiahang Xu, Liping Jiang, Qun Lou, Yuqing Yang, Dongsheng Li, Xudong Wang, Lili Qiu

Fast and small footprint Hybrid HMM-HiFiGAN based system for speech synthesis in Indian languages

Feb 13, 2023
Sudhanshu Srivastava, Ishika Gupta, Anusha Prakash, Jom Kuriakose, Hema A. Murthy

Vicarious Offense and Noise Audit of Offensive Speech Classifiers

Feb 03, 2023
Tharindu Cyril Weerasooriya, Sujan Dutta, Tharindu Ranasinghe, Marcos Zampieri, Christopher M. Homan, Ashiqur R. KhudaBukhsh

Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation

Dec 16, 2022
Rongzhi Gu, Shi-Xiong Zhang, Yuexian Zou, Dong Yu

Efficient Sequence Transduction by Jointly Predicting Tokens and Durations

Apr 13, 2023
Hainan Xu, Fei Jia, Somshubra Majumdar, He Huang, Shinji Watanabe, Boris Ginsburg

ESB: A Benchmark For Multi-Domain End-to-End Speech Recognition

Oct 24, 2022
Sanchit Gandhi, Patrick von Platen, Alexander M. Rush

Improving Autoregressive NLP Tasks via Modular Linearized Attention

Apr 17, 2023
Victor Agostinelli, Lizhong Chen

Progressive Multi-Scale Self-Supervised Learning for Speech Recognition

Dec 07, 2022
Genshun Wan, Tan Liu, Hang Chen, Jia Pan, Cong Liu, Zhongfu Ye
