Florian Metze

Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage

Oct 02, 2025, via arXiv

Multi-Channel Differential ASR for Robust Wearer Speech Recognition on Smart Glasses

Sep 17, 2025, via arXiv

Thinking in Directivity: Speech Large Language Model for Multi-Talker Directional Speech Recognition

Jun 17, 2025, via arXiv

MASV: Speaker Verification with Global and Local Context Mamba

Dec 14, 2024, via arXiv

Error-aware Quantization through Noise Tempering

Dec 11, 2022, via arXiv

Normalized Contrastive Learning for Text-Video Retrieval

Nov 30, 2022, via arXiv

Token-level Sequence Labeling for Spoken Language Understanding using Compositional End-to-End Models

Oct 27, 2022, via arXiv

SQuAT: Sharpness- and Quantization-Aware Training for BERT

Oct 13, 2022, via arXiv

CTC Alignments Improve Autoregressive Translation

Oct 11, 2022, via arXiv

ASR2K: Speech Recognition for Around 2000 Languages without Audio

Sep 06, 2022, via arXiv