"speech": models, code, and papers

SEANet: A Multi-modal Speech Enhancement Network

Sep 04, 2020
Marco Tagliasacchi, Yunpeng Li, Karolis Misiunas, Dominik Roblek

Via arXiv

Speech Synthesis using EEG

Feb 22, 2020
Gautam Krishna, Co Tran, Yan Han, Mason Carnahan

Via arXiv

An Optimized Signal Processing Pipeline for Syllable Detection and Speech Rate Estimation

Mar 07, 2021
Kamini Sabu, Syomantak Chaudhuri, Preeti Rao, Mahesh Patil

Via arXiv

TweetBLM: A Hate Speech Dataset and Analysis of Black Lives Matter-related Microblogs on Twitter

Aug 27, 2021
Sumit Kumar, Raj Ratn Pranesh

Via arXiv

BART based semantic correction for Mandarin automatic speech recognition system

Mar 26, 2021
Yun Zhao, Xuerui Yang, Jinchao Wang, Yongyu Gao, Chao Yan, Yuanfu Zhou

Via arXiv

MLS: A Large-Scale Multilingual Dataset for Speech Research

Dec 19, 2020
Vineel Pratap, Qiantong Xu, Anuroop Sriram, Gabriel Synnaeve, Ronan Collobert

Via arXiv

Language ID Prediction from Speech Using Self-Attentive Pooling and 1D-Convolutions

Apr 24, 2021
Roman Bedyakin, Nikolay Mikhaylovskiy

Via arXiv

NeuralDPS: Neural Deterministic Plus Stochastic Model with Multiband Excitation for Noise-Controllable Waveform Generation

Mar 05, 2022
Tao Wang, Ruibo Fu, Jiangyan Yi, Jianhua Tao, Zhengqi Wen

Via arXiv

Learning Lip-Based Audio-Visual Speaker Embeddings with AV-HuBERT

May 15, 2022
Bowen Shi, Abdelrahman Mohamed, Wei-Ning Hsu

Via arXiv

A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture

Apr 12, 2022
Zhenxing Lu, Mengnan He, Ruixiong Zhang, Caixia Gong

Via arXiv