Title: Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser

Sep 24, 2016

Adhiguna Kuncoro, Miguel Ballesteros, Lingpeng Kong, Chris Dyer, Noah A. Smith

Abstract: We introduce two first-order graph-based dependency parsers achieving a new state of the art. The first is a consensus parser built from an ensemble of independently trained greedy LSTM transition-based parsers with different random initializations. We cast this approach as minimum Bayes risk decoding (under the Hamming cost) and argue that weaker consensus within the ensemble is a useful signal of difficulty or ambiguity. The second parser is a "distillation" of the ensemble into a single model. We train the distillation parser using a structured hinge loss objective with a novel cost that incorporates ensemble uncertainty estimates for each possible attachment, thereby avoiding the intractable cross-entropy computations required by applying standard distillation objectives to problems with structured outputs. The first-order distillation parser matches or surpasses the state of the art on English, Chinese, and German.

* 10 pages. To appear at EMNLP 2016
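
The consensus idea described in the abstract can be illustrated with a minimal sketch. It tallies, for every candidate attachment, the fraction of ensemble members that predict it, and derives a simple per-attachment cost from that fraction. The input layout (one head list per parser, root written as 0), the function names, and the `1 - vote fraction` cost are all assumptions made for illustration; the abstract does not give the paper's exact cost formula, so this should not be read as the authors' implementation.

```python
from collections import Counter
from typing import Dict, List, Tuple

Arc = Tuple[int, int]  # (head, modifier); head 0 stands for the artificial root


def arc_vote_fractions(ensemble_heads: List[List[int]]) -> Dict[Arc, float]:
    """Fraction of ensemble members that predict each (head, modifier) arc.

    ensemble_heads[k][m - 1] is the head that parser k assigns to word m
    (words are numbered from 1). This input layout is an assumption.
    """
    if not ensemble_heads:
        raise ValueError("need at least one parser output")
    counts: Counter = Counter()
    for heads in ensemble_heads:
        for modifier, head in enumerate(heads, start=1):
            counts[(head, modifier)] += 1
    k = len(ensemble_heads)
    return {arc: c / k for arc, c in counts.items()}


def attachment_cost(votes: Dict[Arc, float], head: int, modifier: int) -> float:
    """Illustrative cost: 1 minus the vote fraction (not the paper's exact formula)."""
    return 1.0 - votes.get((head, modifier), 0.0)


def consensus_per_word(votes: Dict[Arc, float], n_words: int) -> List[float]:
    """Highest vote fraction among each word's candidate heads.

    Low values mark words on which the ensemble disagrees.
    """
    best = [0.0] * n_words
    for (_head, modifier), fraction in votes.items():
        best[modifier - 1] = max(best[modifier - 1], fraction)
    return best


# Three hypothetical parsers on a three-word sentence.
ensemble = [[2, 0, 2], [2, 0, 2], [3, 0, 2]]
votes = arc_vote_fractions(ensemble)
print(votes)
print(consensus_per_word(votes, 3))
```

Under the Hamming cost, minimum Bayes risk decoding amounts to choosing the tree whose arcs have the highest total vote fraction, so a full consensus parser would pass these fractions as arc scores to a maximum spanning tree algorithm such as Chu-Liu-Edmonds. That step is left out here to keep the sketch short.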
