speech


Towards Leveraging Sequential Structure in Animal Vocalizations

Nov 13, 2025
Via arXiv

MINDS: A Cross-cultural Dialogue Corpus for Social Norm Classification and Adherence Detection

Nov 13, 2025
Via arXiv

End-to-end Contrastive Language-Speech Pretraining Model For Long-form Spoken Question Answering

Nov 12, 2025
Via arXiv

Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages

Nov 12, 2025
Via arXiv

MERaLiON-SER: Robust Speech Emotion Recognition Model for English and SEA Languages

Nov 12, 2025
Via arXiv

Back to the Future: The Role of Past and Future Context Predictability in Incremental Language Production

Nov 12, 2025
Via arXiv

Tighter Truncated Rectangular Prism Approximation for RNN Robustness Verification

Nov 12, 2025
Via arXiv

POTSA: A Cross-Lingual Speech Alignment Framework for Low Resource Speech-to-Text Translation

Nov 12, 2025
Via arXiv

Beyond saliency: enhancing explanation of speech emotion recognition with expert-referenced acoustic cues

Nov 12, 2025
Via arXiv

Context-Aware Dynamic Chunking for Streaming Tibetan Speech Recognition

Nov 12, 2025
Via arXiv