Samuel Thomas

Self-Speculative Decoding for LLM-based ASR with CTC Encoder Drafts

Mar 11, 2026

NLE: Non-autoregressive LLM-based ASR by Transcript Editing

Mar 09, 2026

Granite-speech: open-source speech-aware LLMs with strong English ASR capabilities

May 14, 2025

Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?

May 14, 2025

CAV-MAE Sync: Improving Contrastive Audio-Visual Mask Autoencoders via Fine-Grained Alignment

May 02, 2025

mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition

Feb 03, 2025

A Non-autoregressive Model for Joint STT and TTS

Jan 15, 2025

Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation

Jun 14, 2024

Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages

May 21, 2023

FisHook -- An Optimized Approach to Marine Specie Classification using MobileNetV2

Apr 04, 2023