speech


BUT Systems for Environmental Sound Deepfake Detection in the ESDD 2026 Challenge

Dec 09, 2025

LG Uplus System with Multi-Speaker IDs and Discriminator-based Sub-Judges for the WildSpoof Challenge

Dec 09, 2025

Efficient ASR for Low-Resource Languages: Leveraging Cross-Lingual Unlabeled Data

Dec 08, 2025

TeluguST-46: A Benchmark Corpus and Comprehensive Evaluation for Telugu-English Speech Translation

Dec 08, 2025

Beyond Unified Models: A Service-Oriented Approach to Low Latency, Context Aware Phonemization for Real Time TTS

Dec 08, 2025

A Simple Method to Enhance Pre-trained Language Models with Speech Tokens for Classification

Dec 08, 2025

A multimodal Bayesian Network for symptom-level depression and anxiety prediction from voice and speech data

Dec 08, 2025

Scriboora: Rethinking Human Pose Forecasting

Nov 19, 2025

PresentCoach: Dual-Agent Presentation Coaching through Exemplars and Interactive Feedback

Nov 19, 2025

Auden-Voice: General-Purpose Voice Encoder for Speech and Language Understanding

Nov 19, 2025