Mirco Ravanelli

Comparison of Speech Tasks in Human Expert and Machine Detection of Parkinson's Disease

Oct 08, 2025
Via arXiv

Investigating Faithfulness in Large Audio Language Models

Sep 26, 2025
Via arXiv

FocalCodec-Stream: Streaming Low-Bitrate Speech Coding via Causal Distillation

Sep 19, 2025
Via arXiv

Audio Prototypical Network For Controllable Music Recommendation

Jul 31, 2025
Via arXiv

Discrete Audio Tokens: More Than a Survey!

Jun 12, 2025
Via arXiv

ALAS: Measuring Latent Speech-Text Alignment For Spoken Language Understanding In Multimodal LLMs

May 26, 2025
Via arXiv

LiSTEN: Learning Soft Token Embeddings for Neural Audio LLMs

May 24, 2025
Via arXiv

Calm-Whisper: Reduce Whisper Hallucination On Non-Speech By Calming Crazy Heads Down

May 19, 2025
Via arXiv

FocalCodec: Low-Bitrate Speech Coding via Focal Modulation Networks

Feb 06, 2025
Via arXiv

Adaptation Odyssey in LLMs: Why Does Additional Pretraining Sometimes Fail to Improve?

Oct 08, 2024
Via arXiv