Mauro Cettolo

The Warmup Dilemma: How Learning Rate Strategies Impact Speech-to-Text Model Convergence

May 29, 2025

FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian

May 28, 2025

Findings of the IWSLT 2024 Evaluation Campaign

Nov 07, 2024

SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation

Nov 03, 2024

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Oct 01, 2024

SBAAM! Eliminating Transcript Dependency in Automatic Subtitling

May 17, 2024

Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection

Oct 24, 2023

No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition through Pitch Manipulation

Oct 10, 2023

Direct Speech Translation for Automatic Subtitling

Sep 27, 2022

Evaluating Subtitle Segmentation for End-to-end Generation Systems

May 19, 2022