Variable Word Rate N-grams

Mar 29, 2000
Yoshihiko Gotoh, Steve Renals




The rate of occurrence of words is not uniform but varies from document to document. Despite this observation, parameters for conventional n-gram language models are usually derived under the assumption of a constant word rate. In this paper we investigate the use of a variable word rate assumption, modelled by a Poisson distribution or a continuous mixture of Poissons. We present an approach to estimating the relative frequencies of words or n-grams that takes prior information about their occurrences into account. Discounting and smoothing schemes are also considered. On the Broadcast News task, the approach demonstrates a reduction in perplexity of up to 10%.
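To make the contrast between a single Poisson and a continuous mixture of Poissons concrete, the following is a minimal sketch and not the authors' implementation. It assumes the mixing distribution over the Poisson rate is a gamma distribution (which yields the negative binomial), fits both models by simple moment matching, and uses made-up per-document counts; the abstract does not specify any of these choices, and all function names are hypothetical.

```python
import math

def poisson_pmf(k, rate):
    """P(K = k) for a Poisson distribution with mean `rate`."""
    return math.exp(-rate + k * math.log(rate) - math.lgamma(k + 1))

def gamma_poisson_pmf(k, shape, scale):
    """P(K = k) when the Poisson rate is gamma(shape, scale) distributed.

    Assumption: gamma mixing. The result is the negative binomial.
    """
    p = scale / (1.0 + scale)
    log_coef = math.lgamma(k + shape) - math.lgamma(shape) - math.lgamma(k + 1)
    return math.exp(log_coef + k * math.log(p) + shape * math.log(1.0 - p))

def fit_by_moments(counts):
    """Return (mean, shape, scale) from per-document counts of one word.

    Moment matching: mean = shape * scale, variance = mean * (1 + scale).
    Requires variance > mean, i.e. counts that are burstier than a Poisson.
    """
    n = len(counts)
    mean = sum(counts) / n
    var = sum((c - mean) ** 2 for c in counts) / n
    if var <= mean:
        raise ValueError("counts are not overdispersed; a single Poisson suffices")
    scale = var / mean - 1.0
    shape = mean / scale
    return mean, shape, scale

# Hypothetical counts of one word in ten documents: absent from most,
# frequent in a few, which a constant word rate describes poorly.
counts = [0, 0, 0, 1, 0, 7, 0, 0, 3, 0]
mean, shape, scale = fit_by_moments(counts)

for k in range(4):
    print(k, round(poisson_pmf(k, mean), 4), round(gamma_poisson_pmf(k, shape, scale), 4))
```

With bursty counts like these, the mixture places more probability on a word being absent from a document and on it occurring many times, while keeping the same mean rate as the single Poisson.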

* 4 pages, 4 figures, ICASSP-2000 

