Fatma El-Ghannam

A Lemma Based Evaluator for Semitic Language Text Summarization Systems

Mar 22, 2014

Tarek El-Shishtawy, Fatma El-Ghannam

Abstract: Matching texts in highly inflected languages such as Arabic with a simple stemming strategy is unlikely to perform well. In this paper, we present an automatic text matching technique for inflectional languages, using Arabic as the test case. The system is an extension of the ROUGE test in which texts are matched at the level of each token's lemma. The experimental results show improved detection of similarities between sentences that have the same semantics but are written in different lexical forms.
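To make the idea of lemma-level matching concrete, here is a minimal sketch of ROUGE-N recall computed once on surface tokens and once on lemmas. It is not the paper's evaluator: the `TOY_LEMMAS` table, the English example sentences, and the function names are all assumptions made for illustration, and a real system would plug in an Arabic lemmatizer where the lookup table is.

```python
from collections import Counter

# Hypothetical toy lemma table (an assumption for illustration only);
# a real evaluator would call an Arabic lemmatizer here.
TOY_LEMMAS = {
    "wrote": "write", "writes": "write", "written": "write",
    "books": "book", "students": "student",
}

def lemmatize(tokens):
    """Map each token to its lemma, falling back to the lowercased token."""
    return [TOY_LEMMAS.get(t.lower(), t.lower()) for t in tokens]

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n_recall(candidate, reference, n=1, use_lemmas=False):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-grams."""
    cand, ref = candidate.split(), reference.split()
    if use_lemmas:
        cand, ref = lemmatize(cand), lemmatize(ref)
    ref_counts = ngrams(ref, n)
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    cand_counts = ngrams(cand, n)
    overlap = sum(min(c, cand_counts[g]) for g, c in ref_counts.items())
    return overlap / total

reference = "the student wrote two books"
candidate = "the students writes two book"
surface = rouge_n_recall(candidate, reference)                # surface forms
lemma = rouge_n_recall(candidate, reference, use_lemmas=True)  # lemma forms
print(surface, lemma)
```

On this toy pair, only two of the five reference tokens match on the surface, while all five match once both sides are reduced to lemmas, which is the kind of gain the abstract reports for inflected text.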

Multi-Topic Multi-Document Summarizer

Jan 03, 2014

Fatma El-Ghannam, Tarek El-Shishtawy

Abstract: Current multi-document summarization systems can successfully extract summary sentences, but with many limitations, including low coverage, inaccurate extraction of important sentences, redundancy, and poor coherence among the selected sentences. The present study introduces a new concept of the centroid approach and reports new techniques for extracting summary sentences from multiple documents. In both techniques, keyphrases are used to weigh sentences and documents. The first summarization technique (Sen-Rich) prefers sentences of maximum richness, while the second (Doc-Rich) prefers sentences from the centroid document. To demonstrate the new summarization system on Arabic documents, we performed two experiments. First, we applied the ROUGE measure to compare the new techniques with the systems presented at TAC2011. The results show that Sen-Rich outperformed all systems in ROUGE-S. Second, the system was applied to summarize multi-topic documents. Using human evaluators, the results show that Doc-Rich is superior, with summary sentences characterized by extra coverage and more cohesion.

* International Journal of Computer Science & Information Technology (IJCSIT) Vol 5, No 6, December 2013
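The abstract above does not give the scoring formulas, so the following is only one possible reading of the Sen-Rich and Doc-Rich contrast, not the authors' implementation. It assumes a sentence's richness is the sum of the weights of the keyphrases it contains, and that the centroid document is the one with the highest total richness. The keyphrase weights, example documents, and function names are all hypothetical.

```python
def keyphrase_score(sentence, weights):
    """Richness of a sentence: sum of weights of the keyphrases it contains."""
    text = sentence.lower()
    return sum(w for kp, w in weights.items() if kp in text)

def sen_rich(docs, weights, k):
    """Pick the k richest sentences from the pool of all documents."""
    pool = [s for doc in docs for s in doc]
    return sorted(pool, key=lambda s: keyphrase_score(s, weights), reverse=True)[:k]

def centroid_doc(docs, weights):
    """Assumed centroid: the document with the highest total richness."""
    return max(docs, key=lambda d: sum(keyphrase_score(s, weights) for s in d))

def doc_rich(docs, weights, k):
    """Pick the k richest sentences from the centroid document only."""
    best = centroid_doc(docs, weights)
    return sorted(best, key=lambda s: keyphrase_score(s, weights), reverse=True)[:k]

# Hypothetical keyphrase weights and documents (lists of sentences).
weights = {"flood": 2.0, "rescue": 1.5, "rain": 1.0}
docs = [
    ["Heavy rain caused a flood in the city.",
     "Rescue teams arrived by boat.",
     "Schools were closed."],
    ["The flood rescue effort continued after the rain.",
     "Officials discussed the rain forecast."],
]

sen_summary = sen_rich(docs, weights, k=2)
doc_summary = doc_rich(docs, weights, k=2)
print(sen_summary)
print(doc_summary)
```

Under these assumptions, Sen-Rich draws its two sentences from different documents, while Doc-Rich keeps both from the same document, which gives an intuition for why the latter might read as more cohesive.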

Keyphrase Based Arabic Summarizer (KPAS)

Jun 23, 2012

Tarek El-Shishtawy, Fatma El-Ghannam

Abstract: This paper describes a computationally inexpensive and efficient generic summarization algorithm for Arabic texts. The algorithm belongs to the extractive summarization family, which reduces the problem to the sub-problems of identifying and extracting representative sentences. Important keyphrases of the document to be summarized are identified using combinations of statistical and linguistic features. The sentence extraction algorithm exploits keyphrases as the primary attributes for ranking a sentence. The present experimental work demonstrates different techniques for achieving various summarization goals, including informative richness, coverage of both main and auxiliary topics, and keeping redundancy to a minimum. A scoring scheme is then adopted that balances these summarization goals. To evaluate the resulting Arabic summaries against well-established systems, aligned English/Arabic texts are used throughout the experiments.

* INFOS 2012, The 8th INFOS2012 International Conference on Informatics and Systems, 14-16 May, 2012
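As a rough illustration of a scoring scheme that balances richness, coverage, and redundancy, the sketch below selects sentences greedily: each step rewards keyphrases not yet covered and penalizes keyphrases already in the summary. This is an assumed scheme, not the KPAS algorithm itself; the `redundancy_penalty` parameter, the keyphrase weights, and the example sentences are hypothetical.

```python
def find_keyphrases(sentence, weights):
    """Return the set of known keyphrases that occur in the sentence."""
    text = sentence.lower()
    return {kp for kp in weights if kp in text}

def greedy_summary(sentences, weights, k, redundancy_penalty=0.5):
    """Greedily pick k sentences, rewarding new keyphrases over repeated ones."""
    covered = set()
    chosen = []
    remaining = list(sentences)
    while remaining and len(chosen) < k:
        def gain(s):
            kps = find_keyphrases(s, weights)
            new = sum(weights[kp] for kp in kps - covered)       # coverage
            repeated = sum(weights[kp] for kp in kps & covered)  # redundancy
            return new - redundancy_penalty * repeated
        best = max(remaining, key=gain)
        chosen.append(best)
        covered |= find_keyphrases(best, weights)
        remaining.remove(best)
    # Return the chosen sentences in their original document order.
    return [s for s in sentences if s in chosen]

# Hypothetical keyphrase weights and sentences.
weights = {"summarization": 2.0, "keyphrase": 1.5, "arabic": 1.0, "redundancy": 1.0}
sentences = [
    "Arabic summarization relies on keyphrase extraction.",
    "Keyphrase extraction supports Arabic summarization.",
    "Redundancy should be kept to a minimum.",
]

summary = greedy_summary(sentences, weights, k=2)
print(summary)
```

The second sentence is as rich as the first, but it repeats the same keyphrases, so the greedy step prefers the third sentence, which adds a topic not yet covered.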

An Accurate Arabic Root-Based Lemmatizer for Information Retrieval Purposes

Mar 15, 2012

Tarek El-Shishtawy, Fatma El-Ghannam

Abstract: Despite its robust syntax, semantic cohesion, and lower ambiguity, lemma-level analysis and generation has not yet received focused attention in the Arabic NLP literature. In the current research, we propose the first non-statistical accurate Arabic lemmatizer algorithm that is suitable for information retrieval (IR) systems. The proposed lemmatizer makes use of different Arabic language knowledge resources to generate an accurate lemma form and its relevant features that support IR purposes. As a POS tagger, the experimental results show that the proposed algorithm achieves a maximum accuracy of 94.8%. For first-seen documents, an accuracy of 89.15% is achieved, compared to 76.7% for the up-to-date Stanford accurate Arabic model on the same dataset.

* IJCSI International Journal of Computer Science Issues, Vol. 9, Issue 1, No 3, January 2012 ISSN (Online): 1694-0814
* 9 pages
