"speech": models, code, and papers

Applying Wav2vec2.0 to Speech Recognition in Various Low-resource Languages

Jan 17, 2021
Cheng Yi, Jianzhong Wang, Ning Cheng, Shiyu Zhou, Bo Xu

Via arXiv

Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge

Mar 27, 2022
Sangjun Park, Kihyun Choo, Joohyung Lee, Anton V. Porov, Konstantin Osipov, June Sig Sung

Via arXiv

TalkNet 2: Non-Autoregressive Depth-Wise Separable Convolutional Model for Speech Synthesis with Explicit Pitch and Duration Prediction

Apr 19, 2021
Stanislav Beliaev, Boris Ginsburg

Via arXiv

Information Sieve: Content Leakage Reduction in End-to-End Prosody For Expressive Speech Synthesis

Aug 04, 2021
Xudong Dai, Cheng Gong, Longbiao Wang, Kaili Zhang

Via arXiv

Amortized Neural Networks for Low-Latency Speech Recognition

Aug 03, 2021
Jonathan Macoskey, Grant P. Strimel, Jinru Su, Ariya Rastrow

Via arXiv

Automatic Spoken Language Identification using a Time-Delay Neural Network

May 19, 2022
Benjamin Kepecs, Homayoon Beigi

Via arXiv

Vers la compréhension automatique de la parole bout-en-bout à moindre effort (Toward end-to-end automatic speech understanding with less effort)

Jul 01, 2022
Marco Naguib, François Portet, Marco Dinarelli

Via arXiv

DBNet: A Dual-branch Network Architecture Processing on Spectrum and Waveform for Single-channel Speech Enhancement

May 06, 2021
Kanghao Zhang, Shulin He, Hao Li, Xueliang Zhang

Via arXiv

Wav2Vec2.0 on the Edge: Performance Evaluation

Feb 12, 2022
Santosh Gondi

Via arXiv

Multi-view Temporal Alignment for Non-parallel Articulatory-to-Acoustic Speech Synthesis

Dec 30, 2020
Jose A. Gonzalez-Lopez, Miriam Gonzalez-Atienza, Alejandro Gomez-Alanis, Jose L. Perez-Cordoba, Phil D. Green

Via arXiv