speech


Pseudo2Real: Task Arithmetic for Pseudo-Label Correction in Automatic Speech Recognition

Oct 09, 2025

VoiceAgentBench: Are Voice Assistants ready for agentic tasks?

Oct 09, 2025

Standard-to-Dialect Transfer Trends Differ across Text and Speech: A Case Study on Intent and Topic Classification in German Dialects

Oct 09, 2025

CS3-Bench: Evaluating and Enhancing Speech-to-Speech LLMs for Mandarin-English Code-Switching

Oct 09, 2025

Full-Duplex-Bench-v2: A Multi-Turn Evaluation Framework for Duplex Dialogue Systems with an Automated Examiner

Oct 09, 2025

Causality Guided Representation Learning for Cross-Style Hate Speech Detection

Oct 09, 2025

Bloodroot: When Watermarking Turns Poisonous For Stealthy Backdoor

Oct 09, 2025

IsoSignVid2Aud: Sign Language Video to Audio Conversion without Text Intermediaries

Oct 09, 2025

How much speech data is necessary for ASR in African languages? An evaluation of data scaling in Kinyarwanda and Kikuyu

Oct 08, 2025

TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Oct 08, 2025