"speech": models, code, and papers

Building a Non-native Speech Corpus Featuring Chinese-English Bilingual Children: Compilation and Rationale

Apr 30, 2023
Hiuchung Hung, Andreas Maier, Thorsten Piske

Via arXiv

Deep Learning-based F0 Synthesis for Speaker Anonymization

Jun 29, 2023
Ünal Ege Gaznepoglu, Nils Peters

Via arXiv

Boosting Punctuation Restoration with Data Generation and Reinforcement Learning

Jul 24, 2023
Viet Dac Lai, Abel Salinas, Hao Tan, Trung Bui, Quan Tran, Seunghyun Yoon, Hanieh Deilamsalehy, Franck Dernoncourt, Thien Huu Nguyen

Via arXiv

Improving Frame-level Classifier for Word Timings with Non-peaky CTC in End-to-End Automatic Speech Recognition

Jun 09, 2023
Xianzhao Chen, Yist Y. Lin, Kang Wang, Yi He, Zejun Ma

Via arXiv

Regularizing Contrastive Predictive Coding for Speech Applications

Apr 12, 2023
Saurabhchand Bhati, Jesús Villalba, Piotr Żelasko, Laureano Moro-Velazquez, Najim Dehak

Via arXiv

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders

May 18, 2023
Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji

Via arXiv

A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models

Jun 01, 2023
Pin-Jui Ku, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Via arXiv

Accelerating Transducers through Adjacent Token Merging

Jun 28, 2023
Yuang Li, Yu Wu, Jinyu Li, Shujie Liu

Via arXiv

A Preliminary Study on Augmenting Speech Emotion Recognition using a Diffusion Model

May 19, 2023
Ibrahim Malik, Siddique Latif, Raja Jurdak, Björn Schuller

Via arXiv

Developmental Bootstrapping of AIs

Aug 11, 2023
Mark Stefik, Robert Price

Via arXiv