A Machine Learning Approach for the Identification of Bengali Noun-Noun Compound Multiword Expressions

Jan 25, 2014
Vivekananda Gayen, Kamal Sarkar

This paper presents a machine learning approach for the identification of Bengali multiword expressions (MWEs) that are bigram nominal compounds. Our proposed approach has two steps: (1) candidate extraction using chunk information and various heuristic rules, and (2) training a Random Forest classifier to sort the candidates into two groups: bigram nominal compound MWE or not. A variety of association measures, syntactic and linguistic clues, and a set of WordNet-based similarity features are used for the MWE identification task. The approach presented in this paper can be used to identify bigram nominal compound MWEs in Bengali running text.

* In Proceedings of ICON-2013: 10th International Conference on Natural Language Processing, pp. 290-296
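
The abstract does not say which association measures are used. As a rough illustration only, the sketch below computes pointwise mutual information (PMI), one common association measure, for adjacent word pairs. The function name `bigram_pmi` and the toy token list are hypothetical and are not taken from the paper. In a pipeline like the one described, a score of this kind would be one feature among several passed to the classifier.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI} for every adjacent word pair in tokens.

    Illustrative assumption: PMI stands in for the paper's unspecified
    association measures, with probabilities estimated by raw counts.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)  # number of adjacent pairs
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        # PMI = log2( P(w1, w2) / (P(w1) * P(w2)) )
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Toy placeholder tokens, not real corpus data.
toy_tokens = ["a", "b", "a", "b"]
toy_scores = bigram_pmi(toy_tokens)
```

Here the pair `("a", "b")` occurs twice and `("b", "a")` once, so the former gets the higher PMI. On a real corpus, candidate pairs would first be filtered by the chunk-based extraction step before any scoring.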
