"speech": models, code, and papers

Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition

Apr 03, 2023
Saumya Y. Sahai, Jing Liu, Thejaswi Muniyappa, Kanthashree M. Sathyendra, Anastasios Alexandridis, Grant P. Strimel, Ross McGowan, Ariya Rastrow, Feng-Ju Chang, Athanasios Mouchtaris, Siegfried Kunzmann

DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion

May 25, 2023
Ha-Yeong Choi, Sang-Hoon Lee, Seong-Whan Lee

Emotion Selectable End-to-End Text-based Speech Editing

Dec 20, 2022
Tao Wang, Jiangyan Yi, Ruibo Fu, Jianhua Tao, Zhengqi Wen, Chu Yuan Zhang

Exploring Non-Verbal Predicates in Semantic Role Labeling: Challenges and Opportunities

Jul 04, 2023
Riccardo Orlando, Simone Conia, Roberto Navigli

Self-supervised speech representation learning for keyword-spotting with light-weight transformers

Mar 07, 2023
Chenyang Gao, Yue Gu, Francesco Caliva, Yuzong Liu

Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations

Nov 14, 2022
Renee Lu, Mostafa Shahin, Beena Ahmed

Differentially Private Adapters for Parameter Efficient Acoustic Modeling

May 19, 2023
Chun-Wei Ho, Chao-Han Huck Yang, Sabato Marco Siniscalchi

LMCodec: A Low Bitrate Speech Codec With Causal Transformer Models

Mar 23, 2023
Teerapat Jenrungrot, Michael Chinen, W. Bastiaan Kleijn, Jan Skoglund, Zalán Borsos, Neil Zeghidour, Marco Tagliasacchi

Voice Conversion With Just Nearest Neighbors

May 30, 2023
Matthew Baas, Benjamin van Niekerk, Herman Kamper

A Stutter Seldom Comes Alone -- Cross-Corpus Stuttering Detection as a Multi-label Problem

May 30, 2023
Sebastian P. Bayerl, Dominik Wagner, Ilja Baumann, Florian Hönig, Tobias Bocklet, Elmar Nöth, Korbinian Riedhammer
