Serial Speakers


Modeling Overlapped Speech with Shuffles

Mar 18, 2026

End-to-End Joint ASR and Speaker Role Diarization with Child-Adult Interactions

Jan 25, 2026

CodeSep: Low-Bitrate Codec-Driven Speech Separation with Base-Token Disentanglement and Auxiliary-Token Serial Prediction

Jan 19, 2026

LTS-VoiceAgent: A Listen-Think-Speak Framework for Efficient Streaming Voice Interaction via Semantic Triggering and Incremental Reasoning

Jan 26, 2026

TagSpeech: End-to-End Multi-Speaker ASR and Diarization with Fine-Grained Temporal Grounding

Jan 11, 2026

Joint ASR and Speaker Role Tagging with Serialized Output Training

Jun 12, 2025

SC-SOT: Conditioning the Decoder on Diarized Speaker Information for End-to-End Overlapped Speech Recognition

Jun 15, 2025

Speaker-Distinguishable CTC: Learning Speaker Distinction Using CTC for Multi-Talker Speech Recognition

Jun 09, 2025

Diarization-Aware Multi-Speaker Automatic Speech Recognition via Large Language Models

Jun 06, 2025

Improving Practical Aspects of End-to-End Multi-Talker Speech Recognition for Online and Offline Scenarios

Jun 17, 2025