Marco Gaido

The Warmup Dilemma: How Learning Rate Strategies Impact Speech-to-Text Model Convergence

May 29, 2025

FAMA: The First Large-Scale Open-Science Speech Foundation Model for English and Italian

May 28, 2025

Granary: Speech Recognition and Translation Dataset in 25 European Languages

May 19, 2025

NUTSHELL: A Dataset for Abstract Generation from Scientific Talks

Feb 24, 2025

Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison

Jan 04, 2025

Speech Foundation Models and Crowdsourcing for Efficient, High-Quality Data Collection

Dec 16, 2024

SPES: Spectrogram Perturbation for Explainable Speech-to-Text Generation

Nov 03, 2024

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Oct 01, 2024

How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not

Sep 25, 2024

Speech-MASSIVE: A Multilingual Speech Dataset for SLU and Beyond

Aug 07, 2024