

Empirical Methods for Compound Splitting


Feb 22, 2003
Philipp Koehn, Kevin Knight




Compounded words are a challenge for NLP applications such as machine translation (MT). We introduce methods to learn splitting rules from monolingual and parallel corpora. We evaluate them against a gold standard and measure their impact on performance of statistical MT systems. Results show accuracy of 99.1% and performance gains for MT of 0.039 BLEU on a German-English noun phrase translation task.
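The abstract does not spell out how splitting rules are learned, so the following is only a minimal sketch of one monolingual, frequency-based approach: enumerate ways to cover a word with known words and keep the candidate whose parts have the highest geometric mean of corpus counts. The function names, the filler letters, the minimum part length, and the toy counts are all illustrative assumptions, not the authors' implementation or data.

```python
from math import prod


def candidate_splits(word, vocab, min_len=3, fillers=("", "s", "es")):
    """Yield lists of known words that cover `word`.

    A filler (assumed here: nothing, "s", or "es") may follow any
    non-final part. All parameters are hypothetical defaults.
    """
    word = word.lower()
    if word in vocab:
        yield [word]
    # Try every split point that leaves room for a part on each side.
    for i in range(min_len, len(word) - min_len + 1):
        head = word[:i]
        if head not in vocab:
            continue
        rest = word[i:]
        for filler in fillers:
            if not rest.startswith(filler):
                continue
            tail = rest[len(filler):]
            if len(tail) < min_len:
                continue
            for sub in candidate_splits(tail, vocab, min_len, fillers):
                yield [head] + sub


def geometric_mean_score(parts, counts):
    """Geometric mean of the corpus counts of the parts."""
    return prod(counts.get(p, 0) for p in parts) ** (1.0 / len(parts))


def best_split(word, counts):
    """Return the highest-scoring candidate, or the word itself if none exists."""
    candidates = list(candidate_splits(word, counts))
    if not candidates:
        return [word.lower()]
    return max(candidates, key=lambda parts: geometric_mean_score(parts, counts))


# Toy counts, invented for illustration only.
toy_counts = {"aktionsplan": 10, "aktion": 960, "plan": 710, "akt": 224, "ion": 1}
print(best_split("Aktionsplan", toy_counts))
```

With these toy counts the sketch prefers the two-part split, because the geometric mean of the counts for "aktion" and "plan" exceeds both the count of the unsplit word and the mean for the three-part alternative. A purely frequency-based score can just as easily favor the wrong candidate when the unsplit word is itself frequent, which is one reason additional evidence such as a parallel corpus is useful.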

* 8 pages, 2 figures. Published at EACL 2003 




