Thomas Hueber

GIPSA-CRISSP

MauBERT: Universal Phonetic Inductive Biases for Few-Shot Acoustic Units Discovery

Dec 22, 2025
Via arXiv

Cued Speech Generation Leveraging a Pre-trained Audiovisual Text-to-Speech Model

Jan 08, 2025
Via arXiv

Simulating Articulatory Trajectories with Phonological Feature Interpolation

Aug 08, 2024
Via arXiv

Fill in the Gap! Combining Self-supervised Representation Learning with Neural Audio Synthesis for Speech Inpainting

May 30, 2024
Via arXiv

Investigating the dynamics of hand and lips in French Cued Speech using attention mechanisms and CTC-based decoding

Jun 14, 2023
Via arXiv

BERT, can HE predict contrastive focus? Predicting and controlling prominence in neural TTS using a language model

Jul 04, 2022
Via arXiv

Self-supervised speech unit discovery from articulatory and acoustic features using VQ-VAE

Jun 17, 2022
Via arXiv

Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding

Apr 11, 2022
Via arXiv

Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation

Apr 05, 2022
Via arXiv

A Benchmark of Dynamical Variational Autoencoders applied to Speech Spectrogram Modeling

Jun 14, 2021
Via arXiv