António Branco

Advancing Neural Encoding of Portuguese with Transformer Albertina PT-*

May 11, 2023
João Rodrigues, Luís Gomes, João Silva, António Branco, Rodrigo Santos, Henrique Lopes Cardoso, Tomás Osório

To advance the neural encoding of Portuguese (PT), and a fortiori the technological preparation of this language for the digital age, we developed a Transformer-based foundation model that sets a new state of the art in this respect for two of its variants, namely the European variant spoken in Portugal (PT-PT) and the American variant spoken in Brazil (PT-BR). To develop this encoder, which we named Albertina PT-*, we used a strong model, DeBERTa, as a starting point and carried out its pre-training over Portuguese data sets, namely a data set we gathered for PT-PT and the brWaC corpus for PT-BR. The performance of Albertina and of competing models was assessed by evaluating them on prominent downstream language processing tasks adapted for Portuguese. Both the Albertina PT-PT and PT-BR versions are distributed free of charge under the most permissive license possible, and can be run on consumer-grade hardware, thus seeking to contribute to the advancement of research and innovation in language technology for Portuguese.
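Since the encoder is a DeBERTa-style masked language model, continued pre-training over Portuguese text boils down to corrupting input tokens and training the network to recover them. A minimal, self-contained sketch of that masking step follows; the function name, the single `[MASK]` token, and the 30% rate used in the demo call are simplifications for illustration (DeBERTa's actual recipe also mixes random and kept tokens, and adds disentangled attention):

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=0):
    """Replace a fraction of tokens with a mask token, returning the
    corrupted sequence plus the positions and original tokens the
    model must recover, as in masked-language-model pre-training."""
    rng = random.Random(seed)  # seeded for reproducibility
    corrupted, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            corrupted.append(mask_token)
            targets[i] = tok  # prediction target at position i
        else:
            corrupted.append(tok)
    return corrupted, targets

sentence = "a língua portuguesa é falada em vários continentes".split()
corrupted, targets = mask_tokens(sentence, mask_rate=0.3)
```

In a real pre-training run, `corrupted` would be fed to the encoder and the cross-entropy loss computed only over the positions stored in `targets`.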

Transfer Learning of Lexical Semantic Families for Argumentative Discourse Units Identification

Sep 06, 2022
João Rodrigues, Ruben Branco, António Branco

Argument mining tasks require handling linguistic phenomena that range from low to high complexity, as well as commonsense knowledge. Previous work has shown that pre-trained language models, built on different pre-training objectives and applied with transfer learning techniques, are highly effective at encoding syntactic and semantic linguistic phenomena. It remains an open question, however, how far existing pre-trained language models encompass the complexity of argument mining tasks. We rely on experimentation to shed light on how language models drawn from different lexical semantic families affect performance on the task of identifying argumentative discourse units. Experimental results show that transfer learning techniques are beneficial to the task and that current methods may be insufficient to leverage commonsense knowledge from different lexical semantic families.
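Identifying argumentative discourse units (ADUs) is standardly cast as sequence labelling, with the fine-tuned language model emitting one tag per token. A toy sketch of the decoding step that turns such tags into ADU spans; the `B-ADU`/`I-ADU`/`O` tag set is an assumption for illustration, not necessarily the paper's exact formulation:

```python
def extract_adus(tokens, bio_tags):
    """Collect argumentative discourse units from BIO-tagged tokens:
    'B-ADU' opens a unit, 'I-ADU' continues it, 'O' is outside any unit."""
    units, current = [], []
    for tok, tag in zip(tokens, bio_tags):
        if tag == "B-ADU":
            if current:
                units.append(" ".join(current))
            current = [tok]
        elif tag == "I-ADU" and current:
            current.append(tok)
        else:  # 'O', or a stray 'I-ADU' with no open unit
            if current:
                units.append(" ".join(current))
            current = []
    if current:
        units.append(" ".join(current))
    return units

tokens = "we should ban cars because they pollute".split()
tags = ["O", "B-ADU", "I-ADU", "I-ADU", "O", "B-ADU", "I-ADU"]
units = extract_adus(tokens, tags)
```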

Anaphoric Binding: an integrated overview

Mar 11, 2021
António Branco

The interpretation of anaphors depends on their antecedents, as the semantic value an anaphor eventually conveys is co-specified by the value of its antecedent. Interestingly, when occurring in a given syntactic position, different anaphors may have different sets of admissible antecedents. Such differences are the basis for the categorization of anaphoric expressions according to their anaphoric capacity, making it important to determine what the sets of admissible antecedents are and how to represent and process this anaphoric capacity for each type of anaphor. From an empirical perspective, these constraints stem from what appear to be quite cogent generalisations and exhibit a universal character, given their cross-linguistic validity. From a conceptual point of view, in turn, the relations among binding constraints involve non-trivial cross-symmetry, which lends them a modular nature and further strengthens the plausibility of their universal character. This kind of anaphoric binding constraints thus appears as a most significant subset of natural language knowledge, usually referred to as binding theory. This paper provides an integrated overview of these constraints holding on the pairing of nominal anaphors with their admissible antecedents, constraints that are based on grammatical relations and structure. In line with the increasing interest in neuro-symbolic approaches to natural language, this paper seeks to contribute to reviving interest in this most intriguing research topic.
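To make the flavour of such constraints concrete, here is a deliberately toy check of the three classical binding principles (Principle A for reflexives, B for pronouns, C for R-expressions). The boolean abstraction of locality and c-command is illustrative only; the paper's integrated overview covers a richer typology, including long-distance anaphors:

```python
def admissible_antecedent(anaphor_type, is_local, c_commands):
    """Toy rendering of the classical binding constraints:
    Principle A: a reflexive must be bound in its local domain;
    Principle B: a pronoun must be free (not bound) in its local domain;
    Principle C: an R-expression must be free everywhere.
    'Bound' is abstracted here as a local, c-commanding antecedent."""
    bound = is_local and c_commands
    if anaphor_type == "reflexive":
        return bound
    if anaphor_type == "pronoun":
        return not bound
    if anaphor_type == "r-expression":
        return not c_commands
    raise ValueError(f"unknown anaphor type: {anaphor_type}")
```

For instance, "John_i admires himself_i" pairs a reflexive with a local c-commanding antecedent (admissible), while "John_i admires him_i" pairs a pronoun with one (excluded).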

Comparative Probing of Lexical Semantics Theories for Cognitive Plausibility and Technological Usefulness

Nov 16, 2020
António Branco, João Rodrigues, Małgorzata Salawa, Ruben Branco, Chakaveh Saedi

Lexical semantics theories differ in advocating that the meaning of words is represented as an inference graph, a feature mapping or a vector space, thus raising the question: is one of these approaches superior to the others in representing lexical semantics appropriately? Or, in its non-antagonistic counterpart: could there be a unified account of lexical semantics in which these approaches seamlessly emerge as (partial) renderings of (different) aspects of a core semantic knowledge base? In this paper, we contribute to these research questions with a number of experiments that systematically probe different lexical semantics theories for their levels of cognitive plausibility and technological usefulness. The empirical findings obtained from these experiments advance our insight into lexical semantics, as the feature-based approach emerges as superior to the other ones, and arguably also move us closer to answering the research questions above.
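The "unified account" reading of the question can be illustrated in miniature: a feature-based lexical entry can be rendered as a point in a vector space whose dimensions are the features themselves, so that vector-space similarity recovers feature overlap. The tiny lexicon and feature names below are invented for illustration:

```python
def features_to_vector(feature_set, feature_space):
    """Render a feature-based lexical entry as a binary vector over a
    fixed, ordered inventory of features."""
    return [1.0 if f in feature_set else 0.0 for f in feature_space]

lexicon = {
    "dog": {"animate", "canine", "pet"},
    "cat": {"animate", "feline", "pet"},
}
# The vector space's dimensions are the features attested in the lexicon.
space = sorted({f for feats in lexicon.values() for f in feats})
vec_dog = features_to_vector(lexicon["dog"], space)
vec_cat = features_to_vector(lexicon["cat"], space)
# Dot product counts shared features, a crude vector-space similarity.
overlap = sum(a * b for a, b in zip(vec_dog, vec_cat))
```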

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

Mar 30, 2020
Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, Walter Daelemans, Koenraad De Smedt, Radovan Garabík, Maria Gavriilidou, Dagmar Gromann, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Jan Odijk, Maciej Ogrodniczuk, Eiríkur Rögnvaldsson, Mike Rosner, Bolette Sandford Pedersen, Inguna Skadiņa, Marko Tadić, Dan Tufiş, Tamás Váradi, Kadri Vider, Andy Way, François Yvon

Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties, including full language equality. However, language barriers impacting business as well as cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe's specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI, with its many opportunities and synergies but also misconceptions, has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions.

* Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). To appear 

Merging External Bilingual Pairs into Neural Machine Translation

Dec 02, 2019
Tao Wang, Shaohui Kuang, Deyi Xiong, António Branco

As neural machine translation (NMT) is not easily amenable to explicit correction of errors, incorporating pre-specified translations into NMT is widely regarded as a non-trivial challenge. In this paper, we propose and explore three methods to endow NMT with pre-specified bilingual pairs. Instead of, for instance, modifying the beam search algorithm during decoding or making complex modifications to the attention mechanism, the mainstream approaches to tackling this challenge, we experiment with appropriately pre-processing the training data to add information about pre-specified translations. Extra embeddings are also used to distinguish pre-specified tokens from the other tokens. Extensive experimentation and analysis indicate that over 99% of the pre-specified phrases are successfully translated (against an 85% baseline) and that there is also a substantive improvement in translation quality with the methods explored here.
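The data pre-processing idea can be sketched as follows: the pre-specified target translation is injected into the source sequence next to the phrase it covers, together with a parallel label stream that would feed the extra embeddings separating original source tokens from injected target tokens. The injection scheme and labels below are illustrative assumptions, not the paper's precise format:

```python
def inline_prespecified(source_tokens, translations):
    """Insert each pre-specified target translation right after the
    source token it covers, emitting a parallel stream of segment labels
    ('S' for source tokens, 'T' for injected target tokens) that would
    index the extra embeddings."""
    out_tokens, out_labels = [], []
    for tok in source_tokens:
        out_tokens.append(tok)
        out_labels.append("S")
        if tok in translations:
            out_tokens.append(translations[tok])
            out_labels.append("T")
    return out_tokens, out_labels

src = "das ist ein haus".split()
tokens, labels = inline_prespecified(src, {"haus": "house"})
```

At training time the decoder then learns to copy the `T`-labelled tokens into its output, which is what makes the pre-specified phrases come through in translation.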

* 7 pages, 3 figures, 5 tables 

Attention Focusing for Neural Machine Translation by Bridging Source and Target Embeddings

May 10, 2018
Shaohui Kuang, Junhui Li, António Branco, Weihua Luo, Deyi Xiong

In neural machine translation, a source sequence of words is encoded into a vector from which a target sequence is generated in the decoding phase. Unlike in statistical machine translation, the associations between source words and their possible target counterparts are not explicitly stored. Source and target words sit at the two ends of a long information processing procedure, mediated by hidden states at both the source encoding and the target decoding phases. This makes it possible for a source word to be incorrectly translated into a target word that is not among its admissible equivalent counterparts in the target language. In this paper, we seek to somewhat shorten the distance between source and target words in that procedure, and thus strengthen their association, by means of a method we term bridging source and target word embeddings. We experiment with three strategies: (1) a source-side bridging model, where source word embeddings are moved one step closer to the output target sequence; (2) a target-side bridging model, which exploits the more relevant source word embeddings for the prediction of the target sequence; and (3) a direct bridging model, which directly connects source and target word embeddings, seeking to minimize errors in the translation of the ones by the others. Experiments and analysis presented in this paper demonstrate that the proposed bridging models are able to significantly improve the quality of both sentence translation, in general, and the alignment and translation of individual source words with target words, in particular.
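The direct bridging strategy, for instance, amounts to an auxiliary loss term that pulls each source word embedding towards the embedding of the target word it aligns with. A toy version of such a term; the function name and the mean-squared-distance form are illustrative assumptions, standing in for whatever exact formulation the paper uses:

```python
import numpy as np

def direct_bridging_loss(src_emb, tgt_emb, alignment):
    """Toy direct-bridging objective: the mean squared distance between
    each source word embedding and the embedding of the target word it
    is aligned to; minimising it pulls the two embedding spaces together.

    src_emb, tgt_emb: arrays of shape (vocab, dim);
    alignment: list of (source_index, target_index) pairs."""
    loss = 0.0
    for i, j in alignment:
        diff = src_emb[i] - tgt_emb[j]
        loss += float(diff @ diff)
    return loss / len(alignment)

src = np.array([[1.0, 0.0], [0.0, 1.0]])
tgt = np.array([[1.0, 0.0], [1.0, 1.0]])
```

Added to the usual translation loss during training, a term like this shortens the path between a source word and its admissible counterparts instead of leaving their association entirely to the intervening hidden states.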

* 9 pages, 6 figures. Accepted by ACL2018 