Distributional Clustering of English Words

Aug 22, 1994
Fernando Pereira, Naftali Tishby, Lillian Lee



We describe and experimentally evaluate a method for automatically clustering words according to their distribution in particular syntactic contexts. Deterministic annealing is used to find lowest-distortion sets of clusters. As the annealing parameter increases, existing clusters become unstable and subdivide, yielding a hierarchical "soft" clustering of the data. Clusters are used as the basis for class models of word co-occurrence, and the models are evaluated with respect to held-out test data.
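
The splitting behaviour described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes that cluster membership is proportional to exp(-beta * D), where D is the KL divergence between a word's context distribution and a cluster centroid, and that centroids are recomputed as membership-weighted averages. The four-word data set is invented, the function names (kl, soft_assign, update_centroids, cluster) are hypothetical, and each value of beta is run from the same slightly perturbed start rather than along a full annealing schedule.

```python
import math

def kl(p, q):
    """KL divergence D(p || q); assumes every entry of q is positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def soft_assign(dists, centroids, beta):
    """Soft memberships, assumed proportional to exp(-beta * D(word || centroid))."""
    memberships = []
    for p in dists:
        d = [kl(p, c) for c in centroids]
        d_min = min(d)  # subtract the minimum for numerical stability
        weights = [math.exp(-beta * (dc - d_min)) for dc in d]
        total = sum(weights)
        memberships.append([w / total for w in weights])
    return memberships

def update_centroids(dists, memberships, n_clusters):
    """Each centroid is the membership-weighted average of the word distributions."""
    dim = len(dists[0])
    centroids = []
    for c in range(n_clusters):
        mass = sum(m[c] for m in memberships)
        centroids.append([
            sum(m[c] * p[j] for m, p in zip(memberships, dists)) / mass
            for j in range(dim)
        ])
    return centroids

def cluster(dists, start, beta, iters=50):
    """Alternate the two steps at a fixed beta; returns (centroids, memberships)."""
    centroids = [list(c) for c in start]
    for _ in range(iters):
        memberships = soft_assign(dists, centroids, beta)
        centroids = update_centroids(dists, memberships, len(centroids))
    return centroids, soft_assign(dists, centroids, beta)

# Invented toy data: four "words", each a distribution over three contexts.
words = [[0.7, 0.2, 0.1], [0.6, 0.3, 0.1], [0.1, 0.2, 0.7], [0.1, 0.3, 0.6]]
# Two centroids placed close to the overall mean, slightly perturbed.
start = [[0.36, 0.25, 0.39], [0.39, 0.25, 0.36]]

_, low = cluster(words, start, beta=0.5)    # low beta: one effective cluster
_, high = cluster(words, start, beta=20.0)  # high beta: the cluster subdivides

for name, result in (("beta=0.5", low), ("beta=20", high)):
    print(name, [[round(m, 3) for m in row] for row in result])
```

At the low value of beta the two centroids collapse onto the overall mean and every word belongs about equally to both. At the high value the same start separates the first two words from the last two. This is the instability the abstract refers to, shown here at only two fixed values of beta.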

* 8 pages, appeared in the proceedings of ACL-93, Columbus, Ohio 

