Inducing Probabilistic Grammars by Bayesian Model Merging

Sep 13, 1994
Andreas Stolcke, Stephen M. Omohundro


We describe a framework for inducing probabilistic grammars from corpora of positive samples. First, samples are incorporated by adding ad-hoc rules to a working grammar; subsequently, elements of the model (such as states or nonterminals) are merged to achieve generalization and a more compact representation. The choice of what to merge and when to stop is governed by the Bayesian posterior probability of the grammar given the data, which formalizes a trade-off between a close fit to the data and a default preference for simpler models ("Occam's Razor"). The general scheme is illustrated using three types of probabilistic grammars: hidden Markov models, class-based n-grams, and stochastic context-free grammars.
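The incorporate-then-merge loop described above can be illustrated with a minimal sketch for an HMM-like model. This is not the authors' implementation: the function names (`incorporate`, `merge`, `log_posterior`, `induce`), the fixed per-parameter prior penalty, and the count-based likelihood with maximum-likelihood parameters are all assumptions chosen to keep the example short.

```python
import math
from collections import Counter
from itertools import combinations

START, END = "<s>", "</s>"

def incorporate(samples):
    """Build an initial model with one dedicated chain of states per sample."""
    trans = Counter()   # (from_state, to_state) -> count
    emit = Counter()    # (state, symbol) -> count
    next_id = 0
    for sample in samples:
        prev = START
        for symbol in sample:
            state = next_id
            next_id += 1
            trans[(prev, state)] += 1
            emit[(state, symbol)] += 1
            prev = state
        trans[(prev, END)] += 1
    return trans, emit

def merge(trans, emit, keep, drop):
    """Return new counts in which state `drop` is folded into state `keep`."""
    def rename(s):
        return keep if s == drop else s
    new_trans, new_emit = Counter(), Counter()
    for (a, b), c in trans.items():
        new_trans[(rename(a), rename(b))] += c
    for (s, sym), c in emit.items():
        new_emit[(rename(s), sym)] += c
    return new_trans, new_emit

def log_likelihood(counts):
    """Sum of c * log(c / row_total), i.e. maximum-likelihood parameters."""
    totals = Counter()
    for (row, _), c in counts.items():
        totals[row] += c
    return sum(c * math.log(c / totals[row]) for (row, _), c in counts.items())

def log_posterior(trans, emit, penalty=2.0):
    """Unnormalised log posterior = log prior + log likelihood."""
    # Assumed prior: a fixed cost per transition or emission parameter,
    # standing in for the "preference for simpler models".
    log_prior = -penalty * (len(trans) + len(emit))
    return log_prior + log_likelihood(trans) + log_likelihood(emit)

def induce(samples, penalty=2.0):
    """Greedily merge the best pair of states until the posterior stops rising."""
    trans, emit = incorporate(samples)
    best = log_posterior(trans, emit, penalty)
    while True:
        states = sorted({s for s, _ in emit})
        candidates = []
        for a, b in combinations(states, 2):
            t, e = merge(trans, emit, a, b)
            candidates.append((log_posterior(t, e, penalty), a, b, t, e))
        if not candidates:
            break
        score, a, b, t, e = max(candidates, key=lambda x: x[0])
        if score <= best:   # no merge improves the posterior: stop
            break
        trans, emit, best = t, e, score
    return trans, emit, best
```

Calling `induce(["ab", "ab"])` starts from four states (two per sample) and ends with two, one emitting `a` and one emitting `b`; merging those two as well would lower the likelihood by more than the prior gains, so the loop stops there. A larger `penalty` pushes the search toward more aggressive generalization.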

* To appear in Grammatical Inference and Applications, Second International Colloquium on Grammatical Inference; Springer Verlag, 1994. 13 pages 

