Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Reza Khodadadi

PSRB: A Comprehensive Benchmark for Evaluating Persian ASR Systems

May 27, 2025

Nima Sedghiyeh, Sara Sadeghi, Reza Khodadadi, Farzin Kashani, Omid Aghdaei, Somayeh Rahimi, Mohammad Sadegh Safari

Abstract:Although Automatic Speech Recognition (ASR) systems have become an integral part of modern technology, their evaluation remains challenging, particularly for low-resource languages such as Persian. This paper introduces Persian Speech Recognition Benchmark(PSRB), a comprehensive benchmark designed to address this gap by incorporating diverse linguistic and acoustic conditions. We evaluate ten ASR systems, including state-of-the-art commercial and open-source models, to examine performance variations and inherent biases. Additionally, we conduct an in-depth analysis of Persian ASR transcriptions, identifying key error types and proposing a novel metric that weights substitution errors. This metric enhances evaluation robustness by reducing the impact of minor and partial errors, thereby improving the precision of performance assessment. Our findings indicate that while ASR models generally perform well on standard Persian, they struggle with regional accents, children's speech, and specific linguistic challenges. These results highlight the necessity of fine-tuning and incorporating diverse, representative training datasets to mitigate biases and enhance overall ASR performance. PSRB provides a valuable resource for advancing ASR research in Persian and serves as a framework for developing benchmarks in other low-resource languages. A subset of the PSRB dataset is publicly available at https://huggingface.co/datasets/PartAI/PSRB.

* 25 pages, 7 figures

Via

Access Paper or Ask Questions

The SVASR System for Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024

Nov 25, 2024

Mohammadreza Molavi, Reza Khodadadi

Figure 1 for The SVASR System for Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024

Figure 2 for The SVASR System for Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024

Figure 3 for The SVASR System for Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024

Figure 4 for The SVASR System for Text-dependent Speaker Verification (TdSV) AAIC Challenge 2024

Abstract:This paper introduces an efficient and accurate pipeline for text-dependent speaker verification (TDSV), designed to address the need for high-performance biometric systems. The proposed system incorporates a Fast-Conformer-based ASR module to validate speech content, filtering out Target-Wrong (TW) and Impostor-Wrong (IW) trials. For speaker verification, we propose a feature fusion approach that combines speaker embeddings extracted from wav2vec-BERT and ReDimNet models to create a unified speaker representation. This system achieves competitive results on the TDSV 2024 Challenge test set, with a normalized min-DCF of 0.0452 (rank 2), highlighting its effectiveness in balancing accuracy and robustness.

Via

Access Paper or Ask Questions