Michiel Bacchiani

LibriTTS-R: A Restored Multi-Speaker Text-to-Speech Corpus

May 30, 2023
Via arXiv

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

Mar 03, 2023
Via arXiv

WaveFit: An Iterative and Non-autoregressive Neural Vocoder based on Fixed-Point Iteration

Oct 03, 2022
Via arXiv

SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping

Mar 31, 2022
Via arXiv

Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers

Feb 16, 2022
Via arXiv

SNRi Target Training for Joint Speech Enhancement and Recognition

Nov 01, 2021
Via arXiv

DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement

Jun 30, 2021
Via arXiv

From Audio to Semantics: Approaches to end-to-end spoken language understanding

Sep 24, 2018
Via arXiv

Toward domain-invariant speech recognition via large scale training

Aug 16, 2018
Via arXiv

State-of-the-art Speech Recognition With Sequence-to-Sequence Models

Feb 23, 2018
Via arXiv