Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Carlos Gómez Rodríguez

A Modest Pareto Optimisation Analysis of Dependency Parsers in 2021

Jun 09, 2021

Mark Anderson, Carlos Gómez Rodríguez

Figure 1 for A Modest Pareto Optimisation Analysis of Dependency Parsers in 2021

Figure 2 for A Modest Pareto Optimisation Analysis of Dependency Parsers in 2021

Figure 3 for A Modest Pareto Optimisation Analysis of Dependency Parsers in 2021

Figure 4 for A Modest Pareto Optimisation Analysis of Dependency Parsers in 2021

Abstract:We evaluate three leading dependency parser systems from different paradigms on a small yet diverse subset of languages in terms of their accuracy-efficiency Pareto front. As we are interested in efficiency, we evaluate core parsers without pretrained language models (as these are typically huge networks and would constitute most of the compute time) or other augmentations that can be transversally applied to any of them. Biaffine parsing emerges as a well-balanced default choice, with sequence-labelling parsing being preferable if inference speed (but not training energy cost) is the priority.

* To be published in proceedings of the 17th International Conference on Parsing Technologies

Via

Access Paper or Ask Questions

A Falta de Pan, Buenas Son Tortas: The Efficacy of Predicted UPOS Tags for Low Resource UD Parsing

Jun 08, 2021

Mark Anderson, Mathieu Dehouck, Carlos Gómez Rodríguez

Figure 1 for A Falta de Pan, Buenas Son Tortas: The Efficacy of Predicted UPOS Tags for Low Resource UD Parsing

Figure 2 for A Falta de Pan, Buenas Son Tortas: The Efficacy of Predicted UPOS Tags for Low Resource UD Parsing

Figure 3 for A Falta de Pan, Buenas Son Tortas: The Efficacy of Predicted UPOS Tags for Low Resource UD Parsing

Figure 4 for A Falta de Pan, Buenas Son Tortas: The Efficacy of Predicted UPOS Tags for Low Resource UD Parsing

Abstract:We evaluate the efficacy of predicted UPOS tags as input features for dependency parsers in lower resource settings to evaluate how treebank size affects the impact tagging accuracy has on parsing performance. We do this for real low resource universal dependency treebanks, artificially low resource data with varying treebank sizes, and for very small treebanks with varying amounts of augmented data. We find that predicted UPOS tags are somewhat helpful for low resource treebanks, especially when fewer fully-annotated trees are available. We also find that this positive impact diminishes as the amount of data increases.

* To be published in proceedings of the 16th International Conference on Parsing Technologies. Earlier versions were rejected at the 16th Conference of the European Chapter of the Association for Computational Linguistics and the 23rd Nordic Conference on Computational Linguistics

Via

Access Paper or Ask Questions

Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance

Jun 02, 2021

Mark Anderson, Anders Søgaard, Carlos Gómez Rodríguez

Figure 1 for Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance

Figure 2 for Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance

Figure 3 for Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance

Figure 4 for Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance

Abstract:S{\o}gaard (2020) obtained results suggesting the fraction of trees occurring in the test data isomorphic to trees in the training set accounts for a non-trivial variation in parser performance. Similar to other statistical analyses in NLP, the results were based on evaluating linear regressions. However, the study had methodological issues and was undertaken using a small sample size leading to unreliable results. We present a replication study in which we also bin sentences by length and find that only a small subset of sentences vary in performance with respect to graph isomorphism. Further, the correlation observed between parser performance and graph isomorphism in the wild disappears when controlling for covariants. However, in a controlled experiment, where covariants are kept fixed, we do observe a strong correlation. We suggest that conclusions drawn from statistical analyses like this need to be tempered and that controlled experiments can complement them by more readily teasing factors apart.

* To appear in the Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics

Via

Access Paper or Ask Questions