Gerard I. Gállego

Hearing to Translate: The Effectiveness of Speech Modality Integration into LLMs

Dec 24, 2025
Via arXiv

Speech-to-Text Translation with Phoneme-Augmented CoT: Enhancing Cross-Lingual Transfer in Low-Resource Scenarios

May 30, 2025
Via arXiv

Unveiling the Role of Pretraining in Direct Speech Translation

Sep 26, 2024
Via arXiv

Single-stage TTS with Masked Audio Token Modeling and Semantic Knowledge Distillation

Sep 17, 2024
Via arXiv

Pushing the Limits of Zero-shot End-to-End Speech Translation

Feb 16, 2024
Via arXiv

SpeechAlign: a Framework for Speech Translation Alignment Evaluation

Sep 20, 2023
Via arXiv

Speech Translation with Foundation Models and Optimal Transport: UPC at IWSLT23

Jun 02, 2023
Via arXiv

Explaining How Transformers Use Context to Build Predictions

May 21, 2023
Via arXiv

Sign Language Translation from Instructional Videos

Apr 14, 2023
Via arXiv

Efficient Speech Translation with Dynamic Latent Perceivers

Oct 28, 2022
Via arXiv