Simultaneous Speech To Text Translation


RASST: Fast Cross-modal Retrieval-Augmented Simultaneous Speech Translation

Jan 30, 2026
Via arXiv

Corpus of Cross-lingual Dialogues with Minutes and Detection of Misunderstandings

Dec 23, 2025
Via arXiv

Direct Simultaneous Translation Activation for Large Audio-Language Models

Sep 19, 2025
Via arXiv

REINA: Regularized Entropy Information-Based Loss for Efficient Simultaneous Speech Translation

Aug 07, 2025
Via arXiv

CMU's IWSLT 2025 Simultaneous Speech Translation System

Jun 16, 2025
Via arXiv

BeaverTalk: Oregon State University's IWSLT 2025 Simultaneous Speech Translation System

May 29, 2025
Via arXiv

MockConf: A Student Interpretation Dataset: Analysis, Word- and Span-level Alignment and Baselines

Jun 05, 2025
Via arXiv

Using Phonemes in cascaded S2S translation pipeline

Apr 22, 2025
Via arXiv

SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation

Apr 22, 2025
Via arXiv

Leveraging Unit Language Guidance to Advance Speech Modeling in Textless Speech-to-Speech Translation

May 21, 2025
Via arXiv