Improving Data Driven Wordclass Tagging by System Combination

Jul 31, 1998
Hans van Halteren, Jakub Zavrel, Walter Daelemans

In this paper we examine how the differences in modelling between different data-driven systems performing the same NLP task can be exploited to yield a higher accuracy than the best individual system. We do this by means of an experiment involving the task of morpho-syntactic wordclass tagging. Four well-known tagger generators (Hidden Markov Model, Memory-Based, Transformation Rules and Maximum Entropy) are trained on the same corpus data. After comparison, their outputs are combined using several voting strategies and second-stage classifiers. All combination taggers outperform their best component, with the best combination showing a 19.1% lower error rate than the best individual tagger.
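To make the idea of combining tagger outputs by voting concrete, here is a minimal sketch of the simplest possible scheme: each tagger casts one vote per token, and the most frequent tag wins. This is only an illustration under stated assumptions. The abstract does not specify which voting strategies the paper uses, so the tie-breaking rule, the function names `majority_vote` and `combine`, and the example tags are all hypothetical.

```python
from collections import Counter


def majority_vote(tags, tie_break_index=0):
    """Pick the tag proposed by the most taggers for one token.

    On a tie, fall back to the tag chosen by the tagger at
    tie_break_index (an assumed rule, not taken from the paper).
    """
    counts = Counter(tags)
    top = max(counts.values())
    winners = [tag for tag in counts if counts[tag] == top]
    if len(winners) == 1:
        return winners[0]
    fallback = tags[tie_break_index]
    return fallback if fallback in winners else winners[0]


def combine(outputs, tie_break_index=0):
    """Combine per-tagger tag sequences (all of equal length) token by token."""
    return [majority_vote(list(column), tie_break_index) for column in zip(*outputs)]


# Hypothetical outputs of four taggers on a three-token sentence.
outputs = [
    ["DT", "NN", "VB"],
    ["DT", "NN", "NN"],
    ["DT", "JJ", "VB"],
    ["DT", "NN", "NN"],
]
combined = combine(outputs)
print(combined)
```

In this sketch the third token is a 2-2 tie, so the result depends on which tagger is trusted as the fallback. A natural choice would be the tagger with the best individual accuracy, though that is a design assumption here rather than something stated in the abstract.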

* Proceedings of the 17th International Conference on Computational Linguistics (COLING-ACL'98)
* 7 pages, LaTeX, uses acl.bst, colacl.sty
