Nathaniel Krasner

Cross-Lingual Representation Alignment Through Contrastive Image-Caption Tuning

May 19, 2025

Nathaniel Krasner, Nicholas Lanuzo, Antonios Anastasopoulos

Abstract: Multilingual alignment of sentence representations has mostly required bitexts to bridge the gap between languages. We investigate whether visual information can bridge this gap instead. Image caption datasets are very easy to create without requiring multilingual expertise, so this offers a more efficient alternative for low-resource languages. We find that multilingual image-caption alignment can implicitly align the text representations between languages, that languages unseen by the encoder in pretraining can be incorporated into this alignment post-hoc, and that these aligned representations are usable for cross-lingual Natural Language Understanding (NLU) and bitext retrieval.

* Accepted to ACL 2025 Main Conference

Revisiting the Effects of Leakage on Dependency Parsing

Mar 24, 2022

Nathaniel Krasner, Miriam Wanner, Antonios Anastasopoulos

Abstract: Recent work by Søgaard (2020) showed that, treebank size aside, overlap between training and test graphs (termed leakage) explains more of the observed variation in dependency parsing performance than other explanations. In this work, we revisit this claim, testing it on more models and languages. We find that it only holds for zero-shot cross-lingual settings. We then propose a more fine-grained measure of such leakage which, unlike the original measure, not only explains but also correlates with observed performance variation. Code and data are available here: https://github.com/miriamwanner/reu-nlp-project

* To be presented at ACL'22 Findings
