Diacritization of Maghrebi Arabic Sub-Dialects

Oct 29, 2018
Ahmed Abdelali, Mohammed Attia, Younes Samihy, Kareem Darwish, Hamdy Mubarak




Diacritization attempts to restore the short vowels in written Arabic text, which are typically omitted. This process is essential for applications such as Text-to-Speech (TTS). While diacritization of Modern Standard Arabic (MSA) still holds the lion's share, research on dialectal Arabic (DA) diacritization is very limited. In this paper, we present our contribution and results on the automatic diacritization of two sub-dialects of Maghrebi Arabic, namely Tunisian and Moroccan, using a character-level deep neural network architecture that stacks two bi-LSTM layers over a CRF output layer. The model achieves a word error rate of 2.7% for Moroccan and 3.6% for Tunisian, and is capable of implicitly identifying the sub-dialect of the input.
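
To make the character-level framing and the word error rate metric concrete, here is a minimal sketch in Python. It is not the authors' code: the helper names `split_diacritics` and `word_error_rate`, the choice of the Unicode range U+064B to U+0652 as the diacritic set, and the rule that a word counts as an error when it differs from the reference in any way are all assumptions made for illustration. The paper's exact evaluation protocol is not stated in the abstract.

```python
# Assumption: the diacritic inventory is the Arabic harakat range
# U+064B..U+0652 (tanween, fatha, damma, kasra, shadda, sukun).
DIACRITICS = {chr(code) for code in range(0x064B, 0x0653)}


def split_diacritics(word):
    """Split a diacritized word into base letters and per-letter labels.

    Each label is the (possibly empty) string of diacritics that follows
    the letter. A character-level tagger would read `letters` and
    predict `labels`.
    """
    letters, labels = [], []
    for ch in word:
        if ch in DIACRITICS:
            if letters:  # ignore a stray mark with no preceding letter
                labels[-1] += ch
        else:
            letters.append(ch)
            labels.append("")
    return letters, labels


def word_error_rate(reference, hypothesis):
    """Fraction of words whose diacritized form differs from the reference.

    Assumes both strings contain the same number of whitespace-separated
    words, which holds when diacritization only adds marks to letters.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if len(ref_words) != len(hyp_words):
        raise ValueError("reference and hypothesis must have the same word count")
    if not ref_words:
        return 0.0
    errors = sum(r != h for r, h in zip(ref_words, hyp_words))
    return errors / len(ref_words)


if __name__ == "__main__":
    # "kataba" written as kaf+fatha, ta+fatha, ba+fatha
    word = "\u0643\u064E\u062A\u064E\u0628\u064E"
    letters, labels = split_diacritics(word)
    print(len(letters), len(labels))

    # Hypothesis gets the last vowel wrong (damma instead of fatha)
    wrong = "\u0643\u064E\u062A\u064E\u0628\u064F"
    print(word_error_rate(word + " " + word, word + " " + wrong))
```

Under this definition, a single wrong diacritic anywhere in a word makes the whole word count as an error, which is why word error rates are typically higher than per-character error rates on the same output.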

* 6 pages, 3 figures 




