Erik Velldal

Compositional Generalization with Grounded Language Models

Jun 07, 2024

Sondre Wold, Étienne Simon, Lucas Georges Gabriel Charpentier, Egor V. Kostylev, Erik Velldal, Lilja Øvrelid

Abstract:Grounded language models use external sources of information, such as knowledge graphs, to meet some of the general challenges associated with pre-training. By extending previous work on compositional generalization in semantic parsing, we allow for a controlled evaluation of the degree to which these models learn and generalize from patterns in knowledge graphs. We develop a procedure for generating natural language questions paired with knowledge graphs that targets different aspects of compositionality and further avoids grounding the language models in information already encoded implicitly in their weights. We evaluate existing methods for combining language models with knowledge graphs and find them to struggle with generalization to sequences of unseen lengths and to novel combinations of seen base components. While our experimental results provide some insight into the expressive power of these models, we hope our work and released datasets motivate future research on how to better combine language models with structured knowledge representations.

* ACL 2024, Findings

It's Difficult to be Neutral -- Human and LLM-based Sentiment Annotation of Patient Comments

Apr 29, 2024

Petter Mæhlum, David Samuel, Rebecka Maria Norman, Elma Jelin, Øyvind Andresen Bjertnæs, Lilja Øvrelid, Erik Velldal

Abstract:Sentiment analysis is an important tool for aggregating patient voices, in order to provide targeted improvements in healthcare services. A prerequisite for this is the availability of in-domain data annotated for sentiment. This article documents an effort to add sentiment annotations to free-text comments in patient surveys collected by the Norwegian Institute of Public Health (NIPH). However, annotation can be a time-consuming and resource-intensive process, particularly when it requires domain expertise. We therefore also evaluate a possible alternative to human annotation, using large language models (LLMs) as annotators. We perform an extensive evaluation of the approach for two openly available pretrained LLMs for Norwegian, experimenting with different configurations of prompts and in-context learning, comparing their performance to human annotators. We find that even for zero-shot runs, models perform well above the baseline for binary sentiment, but still cannot compete with human annotators on the full dataset.

Text-To-KG Alignment: Comparing Current Methods on Classification Tasks

Jun 05, 2023

Sondre Wold, Lilja Øvrelid, Erik Velldal

Abstract:In contrast to large text corpora, knowledge graphs (KG) provide dense and structured representations of factual information. This makes them attractive for systems that supplement or ground the knowledge found in pre-trained language models with an external knowledge source. This has especially been the case for classification tasks, where recent work has focused on creating pipeline models that retrieve information from KGs like ConceptNet as additional context. Many of these models consist of multiple components, and although they differ in the number and nature of these parts, they all have in common that for some given text query, they attempt to identify and retrieve a relevant subgraph from the KG. Due to the noise and idiosyncrasies often found in KGs, it is not known how current methods compare to a scenario where the aligned subgraph is completely relevant to the query. In this work, we try to bridge this knowledge gap by reviewing current approaches to text-to-KG alignment and evaluating them on two datasets where manually created graphs are available, providing insights into the effectiveness of current methods.

* Camera ready version for MATCHING workshop at ACL 2023

NorBench -- A Benchmark for Norwegian Language Models

May 06, 2023

David Samuel, Andrey Kutuzov, Samia Touileb, Erik Velldal, Lilja Øvrelid, Egil Rønningstad, Elina Sigdel, Anna Palatkina

Abstract:We present NorBench: a streamlined suite of NLP tasks and probes for evaluating Norwegian language models (LMs) on standardized data splits and evaluation metrics. We also introduce a range of new Norwegian language models (both encoder and encoder-decoder based). Finally, we compare and analyze their performance, along with other existing LMs, across the different benchmark tests of NorBench.

* Accepted to NoDaLiDa 2023

Entity-Level Sentiment Analysis (ELSA): An exploratory task survey

Apr 27, 2023

Egil Rønningstad, Erik Velldal, Lilja Øvrelid

Abstract:This paper explores the task of identifying the overall sentiment expressed towards volitional entities (persons and organizations) in a document -- what we refer to as Entity-Level Sentiment Analysis (ELSA). While identifying sentiment conveyed towards an entity is well researched for shorter texts like tweets, we find little to no research on this specific task for longer texts with multiple mentions and opinions towards the same entity. This lack of research would be understandable if ELSA can be derived from existing tasks and models. To assess this, we annotate a set of professional reviews for their overall sentiment towards each volitional entity in the text. We sample from data already annotated for document-level, sentence-level, and target-level sentiment in a multi-domain review corpus, and our results indicate that there is no single proxy task that provides this overall sentiment we seek for the entities at a satisfactory level of performance. We present a suite of experiments aiming to assess the contribution towards ELSA provided by document-, sentence-, and target-level sentiment analysis, and provide a discussion of their shortcomings. We show that sentiment in our dataset is expressed not only with an entity mention as target, but also towards targets with a sentiment-relevant relation to a volitional entity. In our data, these relations extend beyond anaphoric coreference resolution, and our findings call for further research of the topic. Finally, we also present a survey of previous relevant work.

* Proceedings of the 29th International Conference on Computational Linguistics, 2022, pages 6773-6783

Measuring Normative and Descriptive Biases in Language Models Using Census Data

Apr 12, 2023

Samia Touileb, Lilja Øvrelid, Erik Velldal

Abstract:We investigate in this paper how distributions of occupations with respect to gender are reflected in pre-trained language models. Such distributions are not always aligned to normative ideals, nor do they necessarily reflect a descriptive assessment of reality. In this paper, we introduce an approach for measuring to what degree pre-trained language models are aligned to normative and descriptive occupational distributions. To this end, we use official demographic information about gender--occupation distributions provided by the national statistics agencies of France, Norway, the United Kingdom, and the United States. We manually generate template-based sentences combining gendered pronouns and nouns with occupations, and subsequently probe a selection of ten language models covering the English, French, and Norwegian languages. The scoring system we introduce in this work is language independent, and can be used on any combination of template-based sentences, occupations, and languages. The approach could also be extended to other dimensions of national census data and other demographic variables.

* Accepted at EACL2023 -- main conference

Trained on 100 million words and still in shape: BERT meets British National Corpus

Mar 29, 2023

David Samuel, Andrey Kutuzov, Lilja Øvrelid, Erik Velldal

Abstract:While modern masked language models (LMs) are trained on ever larger corpora, we here explore the effects of down-scaling training to a modestly sized but representative, well-balanced, and publicly available English text source -- the British National Corpus. We show that pre-training on this carefully curated corpus can reach better performance than the original BERT model. We argue that this type of corpus has great potential as a language modeling benchmark. To showcase this potential, we present fair, reproducible and data-efficient comparative studies of LMs, in which we evaluate several training objectives and model architectures and replicate previous empirical results in a systematic way. We propose an optimized LM architecture called LTG-BERT.

* Accepted to EACL 2023

Direct parsing to sentiment graphs

Mar 24, 2022

David Samuel, Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, Erik Velldal

Abstract:This paper demonstrates how a graph-based semantic parser can be applied to the task of structured sentiment analysis, directly predicting sentiment graphs from text. We advance the state of the art on 4 out of 5 standard benchmark sets. We release the source code, models and predictions.

* Accepted to ACL 2022

Structured Sentiment Analysis as Dependency Graph Parsing

May 30, 2021

Jeremy Barnes, Robin Kurtz, Stephan Oepen, Lilja Øvrelid, Erik Velldal

Abstract:Structured sentiment analysis attempts to extract full opinion tuples from a text, but over time this task has been subdivided into smaller and smaller sub-tasks, e.g., target extraction or targeted polarity classification. We argue that this division has become counterproductive and propose a new unified framework to remedy the situation. We cast the structured sentiment problem as dependency graph parsing, where the nodes are spans of sentiment holders, targets and expressions, and the arcs are the relations between them. We perform experiments on five datasets in four languages (English, Norwegian, Basque, and Catalan) and show that this approach leads to strong improvements over state-of-the-art baselines. Our analysis shows that refining the sentiment graphs with syntactic dependency information further improves results.

* Accepted at ACL-IJCNLP 2021

Large-Scale Contextualised Language Modelling for Norwegian

Apr 13, 2021

Andrey Kutuzov, Jeremy Barnes, Erik Velldal, Lilja Øvrelid, Stephan Oepen

Abstract:We present the ongoing NorLM initiative to support the creation and use of very large contextualised language models for Norwegian (and in principle other Nordic languages), including a ready-to-use software environment, as well as an experience report for data preparation and training. This paper introduces the first large-scale monolingual language models for Norwegian, based on both the ELMo and BERT frameworks. In addition to detailing the training process, we present contrastive benchmark results on a suite of NLP tasks for Norwegian. For additional background and access to the data, models, and software, please see http://norlm.nlpl.eu

* Accepted to NoDaLiDa'2021
