Michael Wiegand

Relation Extraction or Pattern Matching? Unravelling the Generalisation Limits of Language Models for Biographical RE

May 18, 2025

Varvara Arzt, Allan Hanbury, Michael Wiegand, Gábor Recski, Terra Blevins

Abstract: Analysing the generalisation capabilities of relation extraction (RE) models is crucial for assessing whether they learn robust relational patterns or rely on spurious correlations. Our cross-dataset experiments find that RE models struggle with unseen data, even within similar domains. Notably, higher intra-dataset performance does not indicate better transferability, instead often signalling overfitting to dataset-specific artefacts. Our results also show that data quality, rather than lexical similarity, is key to robust transfer, and the choice of optimal adaptation strategy depends on the quality of data available: while fine-tuning yields the best cross-dataset performance with high-quality data, few-shot in-context learning (ICL) is more effective with noisier data. However, even in these cases, zero-shot baselines occasionally outperform all cross-dataset results. Structural issues in RE benchmarks, such as single-relation-per-sample constraints and non-standardised negative class definitions, further hinder model transferability.

Effective Slot Filling Based on Shallow Distant Supervision Methods

Jan 06, 2014

Benjamin Roth, Tassilo Barth, Michael Wiegand, Mittul Singh, Dietrich Klakow

Abstract: Spoken Language Systems at Saarland University (LSV) participated in this year's TAC KBP English slot filling track with 5 runs. Effective algorithms for all parts of the pipeline, from document retrieval to relation prediction and response post-processing, are bundled in a modular end-to-end relation extraction system called RelationFactory. The main run focuses solely on shallow techniques and achieved significant improvements over LSV's system from the previous year, while using the same training data and patterns. The improvements were obtained mainly through a feature representation focusing on surface skip n-grams and improved scoring for extracted distant supervision patterns. Important factors for effective extraction are the training and tuning scheme for distant supervision classifiers and query expansion by a translation model based on Wikipedia links. In the TAC KBP 2013 English Slot Filling evaluation, the submitted main run of the LSV RelationFactory system achieved the top-ranked F1-score of 37.3%.

* to be published in: Proceedings of the Sixth Text Analysis Conference (TAC 2013)
