Umberto Cappellazzo

Omni-AVSR: Towards Unified Multimodal Speech Recognition with Large Language Models

Nov 10, 2025
Via arXiv

Mitigating Attention Sinks and Massive Activations in Audio-Visual Speech Recognition with LLMs

Oct 26, 2025
Via arXiv

Scaling and Enhancing LLM-based AVSR: A Sparse Mixture of Projectors Approach

May 21, 2025
Via arXiv

Adaptive Audio-Visual Speech Recognition via Matryoshka-Based Multimodal LLMs

Mar 09, 2025
Via arXiv

Evaluating and Improving Continual Learning in Spoken Language Understanding

Feb 16, 2024
Via arXiv

Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters

Feb 01, 2024
Via arXiv

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers

Dec 07, 2023
Via arXiv

Continual Contrastive Spoken Language Understanding

Oct 04, 2023
Via arXiv

Training dynamic models using early exits for automatic speech recognition on resource-constrained devices

Sep 18, 2023
Via arXiv

Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding

May 23, 2023
Via arXiv