Egyptian Dialect Stopword List Generation from Social Network Data

Apr 13, 2015
Walaa Medhat, Ahmed H. Yousef, Hoda Korashy

This paper proposes a methodology for generating a stopword list from online social network (OSN) corpora in Egyptian Dialect (ED). The aim of the paper is to investigate the effect of removing ED stopwords on the Sentiment Analysis (SA) task. Previously generated stopword lists were built for Modern Standard Arabic (MSA), which is not the language commonly used in OSNs. We have generated a stopword list for Egyptian dialect to be used with the OSN corpora. We compare the efficiency of text classification when using the generated list, when using previously generated MSA lists, and when combining the Egyptian dialect list with the MSA list. The text classification was performed using Naïve Bayes and Decision Tree classifiers and two feature selection approaches, unigram and bigram. The experiments show that removing ED stopwords gives better performance than using lists of MSA stopwords only.

* The paper is an extension of the earlier paper presented at the language engineering conference, arXiv:1410.1135. It has been accepted by the language engineering journal. Although it has nearly the same structure, it differs in that extensive cross-validation has been added and many negation words have been added to the paper's dataset.
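
The abstract does not say how the stopword list is derived, so the following is only a minimal sketch of one common heuristic, document-frequency thresholding, followed by stopword removal and unigram/bigram extraction. The function names, the `min_doc_ratio` parameter, the English toy corpus (standing in for Egyptian dialect posts), and the one-word `msa_list` are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def candidate_stopwords(docs, min_doc_ratio=0.6):
    """Return tokens that appear in at least min_doc_ratio of the documents.

    Assumption: high document frequency is used as the stopword signal;
    the paper's actual criterion may differ.
    """
    doc_freq = Counter()
    for doc in docs:
        # Count each token once per document (document frequency).
        doc_freq.update(set(doc.split()))
    threshold = min_doc_ratio * len(docs)
    return {tok for tok, df in doc_freq.items() if df >= threshold}


def remove_stopwords(doc, stopwords):
    """Whitespace-tokenize a document and drop any token in the stopword set."""
    return [tok for tok in doc.split() if tok not in stopwords]


def ngrams(tokens, n):
    """Build word n-grams (n=1 for unigrams, n=2 for bigrams)."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Toy English corpus standing in for OSN posts (hypothetical data).
corpus = [
    "the film was great",
    "the food was bad",
    "the service was great",
    "service here is bad",
]

ed_list = candidate_stopwords(corpus)   # list generated from the corpus
msa_list = {"is"}                       # hypothetical pre-existing list
combined = ed_list | msa_list           # the "combined lists" setting

tokens = remove_stopwords(corpus[0], combined)
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
```

The resulting unigram or bigram features would then be passed to a classifier such as Naïve Bayes or a Decision Tree; comparing runs with `ed_list`, `msa_list`, and `combined` mirrors the three settings the abstract describes.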
