"speech": models, code, and papers

Improving EEG based Continuous Speech Recognition

Dec 24, 2019
Gautam Krishna, Co Tran, Mason Carnahan, Yan Han, Ahmed H Tewfik

Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

Jul 02, 2020
Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux

Towards Learning a Universal Non-Semantic Representation of Speech

Feb 25, 2020
Joel Shor, Aren Jansen, Ronnie Maor, Oran Lang, Felix de Chaumont Quitry, Marco Tagliasacchi, Omry Tuval, Ira Shavitt, Dotan Emanuel, Yinnon Haviv

Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech

May 19, 2020
Wenjie Li, Benlai Tang, Xiang Yin, Yushi Zhao, Wei Li, Kang Wang, Hao Huang, Yuxuan Wang, Zejun Ma

Improve Cross-lingual Voice Cloning Using Low-quality Code-switched Data

Oct 14, 2021
Haitong Zhang, Yue Lin

Research on Modeling Units of Transformer Transducer for Mandarin Speech Recognition

Apr 26, 2020
Li Fu, Xiaoxiao Li, Libo Zi

Nonlinear Spatial Filtering in Multichannel Speech Enhancement

Apr 22, 2021
Kristina Tesch, Timo Gerkmann

Artificially Synthesising Data for Audio Classification and Segmentation to Improve Speech and Music Detection in Radio Broadcast

Feb 19, 2021
Satvik Venkatesh, David Moffat, Alexis Kirke, Gözel Shakeri, Stephen Brewster, Jörg Fachner, Helen Odell-Miller, Alex Street, Nicolas Farina, Sube Banerjee, Eduardo Reck Miranda

Word Discovery in Visually Grounded, Self-Supervised Speech Models

Mar 28, 2022
Puyuan Peng, David Harwath
