"speech": models, code, and papers

Speech Synthesis with Mixed Emotions

Aug 11, 2022
Kun Zhou, Berrak Sisman, Rajib Rana, B. W. Schuller, Haizhou Li

Via arXiv

Fast FullSubNet: Accelerate Full-band and Sub-band Fusion Model for Single-channel Speech Enhancement

Dec 18, 2022
Xiang Hao, Xiaofei Li

Via arXiv

Investigating the effect of domain selection on automatic speech recognition performance: a case study on Bangladeshi Bangla

Oct 24, 2022
Ahnaf Mozib Samin, M. Humayan Kobir, Md. Mushtaq Shahriyar Rafee, M. Firoz Ahmed, Shafkat Kibria, M. Shahidur Rahman

Via arXiv

Small-footprint slimmable networks for keyword spotting

Apr 21, 2023
Zuhaib Akhtar, Mohammad Omar Khursheed, Dongsu Du, Yuzong Liu

Via arXiv

Surrogate Gradient Spiking Neural Networks as Encoders for Large Vocabulary Continuous Speech Recognition

Dec 01, 2022
Alexandre Bittar, Philip N. Garner

Via arXiv

Hiding speaker's sex in speech using zero-evidence speaker representation in an analysis/synthesis pipeline

Nov 29, 2022
Paul-Gauthier Noé, Xiaoxiao Miao, Xin Wang, Junichi Yamagishi, Jean-François Bonastre, Driss Matrouf

Via arXiv

Distance-based Weight Transfer for Fine-tuning from Near-field to Far-field Speaker Verification

Mar 01, 2023
Li Zhang, Qing Wang, Hongji Wang, Yue Li, Wei Rao, Yannan Wang, Lei Xie

Via arXiv

ToxVis: Enabling Interpretability of Implicit vs. Explicit Toxicity Detection Models with Interactive Visualization

Mar 01, 2023
Uma Gunturi, Xiaohan Ding, Eugenia H. Rho

Via arXiv

Towards Disentangled Speech Representations

Aug 28, 2022
Cal Peyser, Ronny Huang, Andrew Rosenberg, Tara N. Sainath, Michael Picheny, Kyunghyun Cho

Via arXiv

Down the Rabbit Hole: Detecting Online Extremism, Radicalisation, and Politicised Hate Speech

Jan 27, 2023
Jarod Govers, Philip Feldman, Aaron Dant, Panos Patros

Via arXiv