Steven Schockaert

What do Deck Chairs and Sun Hats Have in Common? Uncovering Shared Properties in Large Concept Vocabularies

Oct 23, 2023
Amit Gajbhiye, Zied Bouraoui, Na Li, Usashi Chatterjee, Luis Espinosa Anke, Steven Schockaert

[Figures 1–4]

Concepts play a central role in many applications. This includes settings where concepts have to be modelled in the absence of sentence context. Previous work has therefore focused on distilling decontextualised concept embeddings from language models. But concepts can be modelled from different perspectives, whereas concept embeddings tend to capture mostly taxonomic structure. To address this issue, we propose a strategy for identifying what different concepts, from a potentially large concept vocabulary, have in common with others. We then represent concepts in terms of the properties they share with the other concepts. To demonstrate the practical usefulness of this way of modelling concepts, we consider the task of ultra-fine entity typing, which is a challenging multi-label classification problem. We show that by augmenting the label set with shared properties, we can improve the performance of state-of-the-art models for this task.
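
The idea of augmenting the label set with shared properties can be sketched with a toy example (the property sets below are invented for illustration; the paper uncovers them automatically from a language model over a large concept vocabulary):

```python
# Hypothetical property sets; in the paper these are distilled from a
# language model rather than listed by hand.
concept_props = {
    "deck chair": {"found at the beach", "foldable", "used outdoors"},
    "sun hat":    {"found at the beach", "wearable", "used outdoors"},
    "laptop":     {"electronic", "used indoors"},
}

def shared_properties(c1, c2, props):
    """Properties that two concepts have in common."""
    return props[c1] & props[c2]

# Shared properties can then be added as extra labels for a classifier.
extra_labels = shared_properties("deck chair", "sun hat", concept_props)
```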

* Accepted for EMNLP 2023 

Solving Hard Analogy Questions with Relation Embedding Chains

Oct 18, 2023
Nitesh Kumar, Steven Schockaert

[Figures 1–4]

Modelling how concepts are related is a central topic in Lexical Semantics. A common strategy is to rely on knowledge graphs (KGs) such as ConceptNet, and to model the relation between two concepts as a set of paths. However, KGs are limited to a fixed set of relation types, and they are incomplete and often noisy. Another strategy is to distill relation embeddings from a fine-tuned language model. However, this is less suitable for words that are only indirectly related and it does not readily allow us to incorporate structured domain knowledge. In this paper, we aim to combine the best of both worlds. We model relations as paths but associate their edges with relation embeddings. The paths are obtained by first identifying suitable intermediate words and then selecting those words for which informative relation embeddings can be obtained. We empirically show that our proposed representations are useful for solving hard analogy questions.
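
The intermediate-word selection step can be sketched as follows (the weakest-link scoring rule and the edge scores are hypothetical; the paper scores edges with relation embeddings distilled from a language model):

```python
def best_intermediate(a, c, candidates, score):
    """Pick the intermediate word b for the chain a -> b -> c, scoring a
    chain by its weakest edge so that both edges stay informative
    (a hypothetical selection criterion)."""
    return max(candidates, key=lambda b: min(score[(a, b)], score[(b, c)]))

# Toy edge scores: how confidently a relation embedding links two words.
score = {
    ("bee", "honey"): 0.9, ("honey", "sweet"): 0.8,    # strong chain
    ("bee", "insect"): 0.9, ("insect", "sweet"): 0.1,  # weak second edge
}
b = best_intermediate("bee", "sweet", ["honey", "insect"], score)
```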


Cabbage Sweeter than Cake? Analysing the Potential of Large Language Models for Learning Conceptual Spaces

Oct 09, 2023
Usashi Chatterjee, Amit Gajbhiye, Steven Schockaert

[Figures 1–4]

The theory of Conceptual Spaces is an influential cognitive-linguistic framework for representing the meaning of concepts. Conceptual spaces are constructed from a set of quality dimensions, which essentially correspond to primitive perceptual features (e.g. hue or size). These quality dimensions are usually learned from human judgements, which means that applications of conceptual spaces tend to be limited to narrow domains (e.g. modelling colour or taste). Encouraged by recent findings about the ability of Large Language Models (LLMs) to learn perceptually grounded representations, we explore the potential of such models for learning conceptual spaces. Our experiments show that LLMs can indeed be used for learning meaningful representations to some extent. However, we also find that fine-tuned models of the BERT family are able to match or even outperform the largest GPT-3 model, despite being 2 to 3 orders of magnitude smaller.

* Accepted for EMNLP 2023 

RelBERT: Embedding Relations with Language Models

Oct 08, 2023
Asahi Ushio, Jose Camacho-Collados, Steven Schockaert

[Figures 1–4]

Many applications need access to background knowledge about how different concepts and entities are related. Although Knowledge Graphs (KG) and Large Language Models (LLM) can address this need to some extent, KGs are inevitably incomplete and their relational schema is often too coarse-grained, while LLMs are inefficient and difficult to control. As an alternative, we propose to extract relation embeddings from relatively small language models. In particular, we show that masked language models such as RoBERTa can be straightforwardly fine-tuned for this purpose, using only a small amount of training data. The resulting model, which we call RelBERT, captures relational similarity in a surprisingly fine-grained way, allowing us to set a new state-of-the-art in analogy benchmarks. Crucially, RelBERT is capable of modelling relations that go well beyond what the model has seen during training. For instance, we obtained strong results on relations between named entities with a model that was only trained on lexical relations between concepts, and we observed that RelBERT can recognise morphological analogies despite not being trained on such examples. Overall, we find that RelBERT significantly outperforms strategies based on prompting language models that are several orders of magnitude larger, including recent GPT-based models and open source models.
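
As a crude stand-in for what comparing relation embeddings buys you, the classic vector-offset trick already lets word pairs be compared by the relation they instantiate (toy vectors below; RelBERT instead derives a far richer representation from a prompt-based fine-tuned masked language model):

```python
import numpy as np

def rel_emb(pair, vec):
    # Vector offset as a crude stand-in for a relation embedding.
    a, b = pair
    return vec[b] - vec[a]

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy embeddings constructed so that "capital of" is a consistent offset.
vec = {
    "france": np.array([1.0, 0.0, 0.0]), "paris":  np.array([1.0, 0.0, 1.0]),
    "japan":  np.array([0.0, 1.0, 0.0]), "tokyo":  np.array([0.0, 1.0, 1.0]),
    "dog":    np.array([0.5, 0.5, 0.0]), "animal": np.array([1.0, 0.5, 0.0]),
}
same_rel = cos(rel_emb(("france", "paris"), vec), rel_emb(("japan", "tokyo"), vec))
diff_rel = cos(rel_emb(("france", "paris"), vec), rel_emb(("dog", "animal"), vec))
```

Pairs instantiating the same relation get near-identical relation vectors, while unrelated pairs do not.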


RAGAS: Automated Evaluation of Retrieval Augmented Generation

Sep 26, 2023
Shahul Es, Jithin James, Luis Espinosa-Anke, Steven Schockaert

[Figures 1–4]

We introduce RAGAs (Retrieval Augmented Generation Assessment), a framework for reference-free evaluation of Retrieval Augmented Generation (RAG) pipelines. RAG systems are composed of a retrieval and an LLM-based generation module, and provide LLMs with knowledge from a reference textual database, which enables them to act as a natural language layer between a user and textual databases, reducing the risk of hallucinations. Evaluating RAG architectures is, however, challenging because there are several dimensions to consider: the ability of the retrieval system to identify relevant and focused context passages, the ability of the LLM to exploit such passages in a faithful way, or the quality of the generation itself. With RAGAs, we put forward a suite of metrics which can be used to evaluate these different dimensions without having to rely on ground-truth human annotations. We posit that such a framework can crucially contribute to faster evaluation cycles of RAG architectures, which is especially important given the fast adoption of LLMs.
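
A reference-free faithfulness check of this kind can be sketched with a naive lexical proxy (the actual RAGAs faithfulness metric uses an LLM judge rather than word overlap, and the threshold below is an arbitrary choice):

```python
def faithfulness(answer_statements, context, threshold=0.5):
    """Toy reference-free proxy: the fraction of answer statements whose
    content words mostly appear in the retrieved context."""
    ctx = set(context.lower().split())
    supported = 0
    for s in answer_statements:
        words = [w for w in s.lower().split() if len(w) > 3]
        overlap = sum(w in ctx for w in words) / max(len(words), 1)
        supported += overlap >= threshold
    return supported / len(answer_statements)

context = "the eiffel tower is located in paris france and was completed in 1889"
statements = ["the eiffel tower is in paris",  # supported by the context
              "it was built in 1750"]          # hallucinated detail
score = faithfulness(statements, context)
```

Only the first statement is supported, so the answer scores 0.5 on this toy metric.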

* Reference-free (not tied to having ground truth available) evaluation framework for retrieval-augmented generation 

Inductive Knowledge Graph Completion with GNNs and Rules: An Analysis

Aug 14, 2023
Akash Anil, Víctor Gutiérrez-Basulto, Yazmín Ibáñez-García, Steven Schockaert

[Figures 1–4]

The task of inductive knowledge graph completion requires models to learn inference patterns from a training graph, which can then be used to make predictions on a disjoint test graph. Rule-based methods seem like a natural fit for this task, but in practice they significantly underperform state-of-the-art methods based on Graph Neural Networks (GNNs), such as NBFNet. We hypothesise that the underperformance of rule-based methods is due to two factors: (i) implausible entities are not ranked at all and (ii) only the most informative path is taken into account when determining the confidence in a given link prediction answer. To analyse the impact of these factors, we study a number of variants of a rule-based approach, which are specifically aimed at addressing the aforementioned issues. We find that the resulting models can achieve a performance which is close to that of NBFNet. Crucially, the considered variants only use a small fraction of the evidence that NBFNet relies on, which means that they largely keep the interpretability advantage of rule-based methods. Moreover, we show that a further variant, which does look at the full KG, consistently outperforms NBFNet.
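
Factor (ii) can be illustrated directly: keeping only the most informative path ignores corroborating evidence, whereas a noisy-or combination lets several weaker paths add up (toy confidences below; the paper's exact aggregation variants may differ):

```python
def best_path(confidences):
    """Standard rule-based scoring: only the most informative path counts."""
    return max(confidences)

def noisy_or(confidences):
    """Aggregate all paths: the link holds unless every path fails."""
    p_fail = 1.0
    for c in confidences:
        p_fail *= (1.0 - c)
    return 1.0 - p_fail

paths = [0.6, 0.5, 0.4]  # three rule-derived paths supporting the same link
```

With these confidences, best-path scoring gives 0.6 while noisy-or gives 0.88, so multiple moderately confident paths jointly outrank a single strong one.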


A RelEntLess Benchmark for Modelling Graded Relations between Named Entities

May 24, 2023
Asahi Ushio, Jose Camacho-Collados, Steven Schockaert

[Figures 1–4]

Relations such as "is influenced by", "is known for" or "is a competitor of" are inherently graded: we can rank entity pairs based on how well they satisfy these relations, but it is hard to draw a line between those pairs that satisfy them and those that do not. Such graded relations play a central role in many applications, yet they are typically not covered by existing Knowledge Graphs. In this paper, we consider the possibility of using Large Language Models (LLMs) to fill this gap. To this end, we introduce a new benchmark, in which entity pairs have to be ranked according to how much they satisfy a given graded relation. The task is formulated as a few-shot ranking problem, where models only have access to a description of the relation and five prototypical instances. We use the proposed benchmark to evaluate state-of-the-art relation embedding strategies as well as several recent LLMs, covering both publicly available LLMs and closed models such as GPT-4. Overall, we find a strong correlation between model size and performance, with smaller Language Models struggling to outperform a naive baseline. The results of the largest Flan-T5 and OPT models are remarkably strong, although a clear gap with human performance remains.


EnCore: Pre-Training Entity Encoders using Coreference Chains

May 22, 2023
Frank Mtumbuka, Steven Schockaert

[Figures 1–4]

Entity typing is the task of assigning semantic types to the entities that are mentioned in a text. Since obtaining sufficient amounts of manual annotations is expensive, current state-of-the-art methods are typically trained on automatically labelled datasets, e.g. by exploiting links between Wikipedia pages. In this paper, we propose to use coreference chains as an additional supervision signal. Specifically, we pre-train an entity encoder using a contrastive loss, such that entity embeddings of coreferring entities are more similar to each other than to the embeddings of other entities. Since this strategy is not tied to Wikipedia, we can pre-train our entity encoder on genres other than encyclopedic text and on larger amounts of data. Our experimental results show that the proposed pre-training strategy allows us to improve the state-of-the-art in fine-grained entity typing, provided that only high-quality coreference links are exploited.
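
The contrastive objective can be sketched as an InfoNCE-style loss over mention embeddings (the toy vectors, names, and temperature are illustrative; the paper's exact loss may differ):

```python
import numpy as np

def info_nce(anchor, positive, negatives, tau=0.1):
    """Pull the anchor towards a coreferring mention, push it away from
    mentions of other entities."""
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    logits = np.array([cos(anchor, positive)] +
                      [cos(anchor, n) for n in negatives]) / tau
    logits -= logits.max()  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(probs[0]))  # cross-entropy against the positive

anchor = np.array([1.0, 0.1])  # embedding of "the senator"
coref = np.array([0.9, 0.2])   # coreferring mention, e.g. "Smith"
others = [np.array([0.0, 1.0]), np.array([0.1, 0.9])]  # other entities
loss_coref = info_nce(anchor, coref, others)
loss_wrong = info_nce(anchor, others[0], [coref, others[1]])
```

The loss is small when the positive is a genuine coreferent and large when it is not, which is what drives coreferring mentions together during pre-training.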


Ultra-Fine Entity Typing with Prior Knowledge about Labels: A Simple Clustering Based Strategy

May 22, 2023
Na Li, Zied Bouraoui, Steven Schockaert

[Figures 1–4]

Ultra-fine entity typing (UFET) is the task of inferring the semantic types, from a large set of fine-grained candidates, that apply to a given entity mention. This task is especially challenging because we only have a small number of training examples for many of the types, even with distant supervision strategies. State-of-the-art models, therefore, have to rely on prior knowledge about the type labels in some way. In this paper, we show that the performance of existing methods can be improved using a simple technique: we use pre-trained label embeddings to cluster the labels into semantic domains and then treat these domains as additional types. We show that this strategy consistently leads to improved results, as long as high-quality label embeddings are used. We furthermore use the label clusters as part of a simple post-processing technique, which results in further performance gains. Both strategies treat the UFET model as a black box and can thus straightforwardly be used to improve a wide range of existing models.
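
The clustering step can be sketched as follows (2-d toy label embeddings and a fixed initialisation for determinism; the paper uses pre-trained label embeddings and a standard clustering algorithm):

```python
import numpy as np

def kmeans(X, init, iters=20):
    """Minimal k-means; returns a cluster index per row of X."""
    centroids = X[init].copy()
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(len(centroids)):
            if (assign == j).any():
                centroids[j] = X[assign == j].mean(0)
    return assign

labels = ["athlete", "footballer", "coach", "senator", "president", "governor"]
X = np.array([[1.0, 0.1], [0.9, 0.2], [1.1, 0.0],   # sports-like region
              [0.0, 1.0], [0.1, 0.9], [0.2, 1.1]])  # politics-like region
assign = kmeans(X, init=[0, 3])

# Each cluster becomes an extra "domain" type appended to the label set.
augmented = {lab: [lab, f"domain_{c}"] for lab, c in zip(labels, assign)}
```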


Distilling Semantic Concept Embeddings from Contrastively Fine-Tuned Language Models

May 16, 2023
Na Li, Hanane Kteich, Zied Bouraoui, Steven Schockaert

[Figures 1–4]

Learning vectors that capture the meaning of concepts remains a fundamental challenge. Somewhat surprisingly, perhaps, pre-trained language models have thus far only enabled modest improvements to the quality of such concept embeddings. Current strategies for using language models typically represent a concept by averaging the contextualised representations of its mentions in some corpus. This is potentially sub-optimal for at least two reasons. First, contextualised word vectors have an unusual geometry, which hampers downstream tasks. Second, concept embeddings should capture the semantic properties of concepts, whereas contextualised word vectors are also affected by other factors. To address these issues, we propose two contrastive learning strategies, based on the view that whenever two sentences reveal similar properties, the corresponding contextualised vectors should also be similar. One strategy is fully unsupervised, estimating the properties which are expressed in a sentence from the neighbourhood structure of the contextualised word embeddings. The second strategy instead relies on a distant supervision signal from ConceptNet. Our experimental results show that the resulting vectors substantially outperform existing concept embeddings in predicting the semantic properties of concepts, with the ConceptNet-based strategy achieving the best results. These findings are furthermore confirmed in a clustering task and in the downstream task of ontology completion.
