"speech": models, code, and papers

Active Learning of Non-semantic Speech Tasks with Pretrained Models

Nov 03, 2022
Harlin Lee, Aaqib Saeed, Andrea L. Bertozzi

STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation

Mar 20, 2022
Qingkai Fang, Rong Ye, Lei Li, Yang Feng, Mingxuan Wang

Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation

Jan 25, 2023
Shahar Lutati, Eliya Nachmani, Lior Wolf

Correcting Misproducted Speech using Spectrogram Inpainting

Apr 07, 2022
Talia Ben-Simon, Felix Kreuk, Faten Awwad, Jacob T. Cohen, Joseph Keshet

SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate

Jul 13, 2022
Nabarun Goswami, Tatsuya Harada

Multilingual HateCheck: Functional Tests for Multilingual Hate Speech Detection Models

Jun 20, 2022
Paul Röttger, Haitham Seelawi, Debora Nozza, Zeerak Talat, Bertie Vidgen

Automatic Prosody Annotation with Pre-Trained Text-Speech Model

Jun 16, 2022
Ziqian Dai, Jianwei Yu, Yan Wang, Nuo Chen, Yanyao Bian, Guangzhi Li, Deng Cai, Dong Yu

Right the docs: Characterising voice dataset documentation practices used in machine learning

Mar 19, 2023
Kathy Reid, Elizabeth T. Williams

READIN: A Chinese Multi-Task Benchmark with Realistic and Diverse Input Noises

Feb 14, 2023
Chenglei Si, Zhengyan Zhang, Yingfa Chen, Xiaozhi Wang, Zhiyuan Liu, Maosong Sun

TaylorAECNet: A Taylor Style Neural Network for Full-Band Echo Cancellation

Mar 11, 2023
Weiming Xu, Zhihao Guo
