Herman Kamper

Voice Conversion With Just Nearest Neighbors

May 30, 2023

Visually grounded few-shot word acquisition with fewer shots

May 25, 2023

Mitigating Catastrophic Forgetting for Few-Shot Spoken Word Classification Through Meta-Learning

May 22, 2023

TransFusion: Transcribing Speech with Multinomial Diffusion

Oct 14, 2022

YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding

Oct 12, 2022

Towards visually prompted keyword localisation for zero-resource spoken languages

Oct 12, 2022

GAN You Hear Me? Reclaiming Unconditional Speech Synthesis from Diffusion Models

Oct 11, 2022

A Temporal Extension of Latent Dirichlet Allocation for Unsupervised Acoustic Unit Discovery

Jun 29, 2022

Word Segmentation on Discovered Phone Units with Dynamic Programming and Self-Supervised Scoring

Feb 24, 2022

Keyword localisation in untranscribed speech using visually grounded speech models

Feb 02, 2022