Stavros Petridis

Revival with Voice: Multi-modal Controllable Text-to-Speech Synthesis

May 25, 2025

Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach

May 21, 2025

FaceCrafter: Identity-Conditional Diffusion with Disentangled Control over Facial Pose, Expression, and Emotion

May 21, 2025

KeySync: A Robust Approach for Leakage-free Lip Synchronization in High Resolution

May 01, 2025

Contextual Speech Extraction: Leveraging Textual History as an Implicit Cue for Target Speech Extraction

Mar 11, 2025

Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs

Mar 09, 2025

Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations

Mar 08, 2025

KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation

Mar 03, 2025

Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs

Nov 04, 2024

Full-Rank No More: Low-Rank Weight Training for Modern Speech Recognition Models

Oct 10, 2024