Mauro Cettolo

The Warmup Dilemma: How Learning Rate Strategies Impact Speech-to-Text Model Convergence

May 29, 2025
Via arXiv

FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian

May 28, 2025
Via arXiv

Findings of the IWSLT 2024 Evaluation Campaign

Nov 07, 2024
Via arXiv

SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation

Nov 03, 2024
Via arXiv

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Oct 01, 2024
Via arXiv

SBAAM! Eliminating Transcript Dependency in Automatic Subtitling

May 17, 2024
Via arXiv

Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection

Oct 24, 2023
Via arXiv

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

Oct 10, 2023
Via arXiv

Direct Speech Translation for Automatic Subtitling

Sep 27, 2022
Via arXiv

Evaluating Subtitle Segmentation for End-to-end Generation Systems

May 19, 2022
Via arXiv