Abstract:Despite the advances in neural text to speech (TTS), many Arabic dialectal varieties remain marginally addressed, with most resources concentrated on Modern Spoken Arabic (MSA) and Gulf dialects, leaving Egyptian Arabic -- the most widely understood Arabic dialect -- severely under-resourced. We address this gap by introducing NileTTS: 38 hours of transcribed speech from two speakers across diverse domains including medical, sales, and general conversations. We construct this dataset using a novel synthetic pipeline: large language models (LLM) generate Egyptian Arabic content, which is then converted to natural speech using audio synthesis tools, followed by automatic transcription and speaker diarization with manual quality verification. We fine-tune XTTS v2, a state-of-the-art multilingual TTS model, on our dataset and evaluate against the baseline model trained on other Arabic dialects. Our contributions include: (1) the first publicly available Egyptian Arabic TTS dataset, (2) a reproducible synthetic data generation pipeline for dialectal TTS, and (3) an open-source fine-tuned model. All resources are released to advance Egyptian Arabic speech synthesis research.
Abstract:AI-assisted nuclei segmentation in histopathological images is a crucial task in the diagnosis and treatment of cancer diseases. It decreases the time required to manually screen microscopic tissue images and can resolve the conflict between pathologists during diagnosis. Deep Learning has proven useful in such a task. However, lack of labeled data is a significant barrier for deep learning-based approaches. In this study, we propose a novel approach to nuclei segmentation that leverages the available labelled and unlabelled data. The proposed method combines the strengths of both transductive and inductive learning, which have been previously attempted separately, into a single framework. Inductive learning aims at approximating the general function and generalizing to unseen test data, while transductive learning has the potential of leveraging the unlabelled test data to improve the classification. To the best of our knowledge, this is the first study to propose such a hybrid approach for medical image segmentation. Moreover, we propose a novel two-stage transductive inference scheme. We evaluate our approach on MoNuSeg benchmark to demonstrate the efficacy and potential of our method.