Jan Skoglund

NOMAD: Unsupervised Learning of Perceptual Embeddings for Speech Enhancement and Non-matching Reference Audio Quality Assessment

Sep 28, 2023

LMCodec: A Low Bitrate Speech Codec With Causal Transformer Models

Mar 23, 2023

Using Rater and System Metadata to Explain Variance in the VoiceMOS Challenge 2022 Dataset

Sep 14, 2022

Ultra-Low-Bitrate Speech Coding with Pretrained Transformers

Jul 05, 2022

A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality

Apr 05, 2022

SoundStream: An End-to-End Neural Audio Codec

Jul 07, 2021

Handling Background Noise in Neural Speech Generation

Feb 23, 2021

WARP-Q: Quality Prediction For Generative Neural Speech Codecs

Feb 20, 2021

Generative Speech Coding with Predictive Variance Regularization

Feb 18, 2021

A Real-Time Wideband Neural Vocoder at 1.6 kb/s Using LPCNet

Mar 28, 2019