Maja Pantic

RT-LA-VocE: Real-Time Low-SNR Audio-Visual Speech Enhancement

Jul 10, 2024
Via arXiv

Dynamic Data Pruning for Automatic Speech Recognition

Jun 26, 2024
Via arXiv

MSRS: Training Multimodal Speech Recognition Models from Scratch with Sparse Mask Optimization

Jun 25, 2024
Via arXiv

EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Apr 29, 2024
Via arXiv

BRAVEn: Improving Self-Supervised Pre-training for Visual and Auditory Speech Recognition

Apr 02, 2024
Via arXiv

Audio-visual video-to-speech synthesis with synthesized input audio

Jul 31, 2023
Via arXiv

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition

Jul 10, 2023
Via arXiv

Large-scale unsupervised audio pre-training for video-to-speech synthesis

Jun 27, 2023
Via arXiv

Laughing Matters: Introducing Laughing-Face Generation using Diffusion Models

May 15, 2023
Via arXiv

SynthVSR: Scaling Up Visual Speech Recognition With Synthetic Supervision

Apr 03, 2023
Via arXiv