We explore how to improve machine translation systems by adding more translation data in situations where we already have substantial resources. The main challenge is how to buck the commonly encountered trend of diminishing returns. We present an active learning-style data solicitation algorithm to meet this challenge. We test it, gathering annotations via Amazon Mechanical Turk, and find an order-of-magnitude increase in the rate of performance improvement.
* In Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, pages 854-864, Uppsala, Sweden, July 2010. Association for Computational Linguistics
* 11 pages, 14 figures; appeared in Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, July 2010