Adrián Bazaga

Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models

Apr 07, 2025

Adrián Bazaga, Rexhina Blloshmi, Bill Byrne, Adrià de Gispert

Abstract: Large Language Models (LLMs) have emerged as powerful tools for generating coherent text, understanding context, and performing reasoning tasks. However, they struggle with temporal reasoning, which requires processing time-related information such as event sequencing, durations, and inter-temporal relationships. These capabilities are critical for applications including question answering, scheduling, and historical analysis. In this paper, we introduce TISER, a novel framework that enhances the temporal reasoning abilities of LLMs through a multi-stage process that combines timeline construction with iterative self-reflection. Our approach leverages test-time scaling to extend the length of reasoning traces, enabling models to capture complex temporal dependencies more effectively. This strategy not only boosts reasoning accuracy but also improves the traceability of the inference process. Experimental results demonstrate state-of-the-art performance across multiple benchmarks, including out-of-distribution test sets, and reveal that TISER enables smaller open-source models to surpass larger closed-weight models on challenging temporal reasoning tasks.

FLUID-LLM: Learning Computational Fluid Dynamics with Spatiotemporal-aware Large Language Models

Jun 06, 2024

Max Zhu, Adrián Bazaga, Pietro Liò

Abstract: Learning computational fluid dynamics (CFD) traditionally relies on computationally intensive simulations of the Navier-Stokes equations. Recently, large language models (LLMs) have shown remarkable pattern recognition and reasoning abilities in natural language processing (NLP) and computer vision (CV). However, these models struggle with the complex geometries inherent in fluid dynamics. We introduce FLUID-LLM, a novel framework combining pre-trained LLMs with spatiotemporal-aware encoding to predict unsteady fluid dynamics. Our approach leverages the temporal autoregressive abilities of LLMs alongside spatial-aware layers, bridging the gap between previous CFD prediction methods. Evaluations on standard benchmarks reveal significant performance improvements across various fluid datasets. Our results demonstrate that FLUID-LLM effectively integrates spatiotemporal information into pre-trained LLMs, enhancing CFD task performance.

TabMDA: Tabular Manifold Data Augmentation for Any Classifier using Transformers with In-context Subsetting

Jun 03, 2024

Andrei Margeloiu, Adrián Bazaga, Nikola Simidjievski, Pietro Liò, Mateja Jamnik

Abstract: Tabular data is prevalent in many critical domains, yet it is often challenging to acquire in large quantities. This scarcity usually results in poor performance of machine learning models on such data. Data augmentation, a common strategy for performance improvement in vision and language tasks, typically underperforms for tabular data due to the lack of explicit symmetries in the input space. To overcome this challenge, we introduce TabMDA, a novel method for manifold data augmentation on tabular data. This method utilises a pre-trained in-context model, such as TabPFN, to map the data into a manifold space. TabMDA performs label-invariant transformations by encoding the data multiple times with varied contexts. This process explores the manifold of the underlying in-context models, thereby enlarging the training dataset. TabMDA is a training-free method, making it applicable to any classifier. We evaluate TabMDA on five standard classifiers and observe significant performance improvements across various tabular datasets. Our results demonstrate that TabMDA provides an effective way to leverage information from pre-trained in-context models to enhance the performance of downstream classifiers.

HyperBERT: Mixing Hypergraph-Aware Layers with Language Models for Node Classification on Text-Attributed Hypergraphs

Feb 13, 2024

Adrián Bazaga, Pietro Liò, Gos Micklem

Abstract: Hypergraphs are marked by complex topology, expressing higher-order interactions among multiple entities with hyperedges. Lately, hypergraph-based deep learning methods to learn informative data representations for the problem of node classification on text-attributed hypergraphs have garnered increasing research attention. However, existing methods struggle to simultaneously capture the full extent of hypergraph structural information and the rich linguistic attributes inherent in the node attributes, which largely hampers their effectiveness and generalizability. To overcome these challenges, we explore ways to further augment a pretrained BERT model with specialized hypergraph-aware layers for the task of node classification. Such layers introduce higher-order structural inductive bias into the language model, thus improving the model's capacity to harness both higher-order context information from the hypergraph structure and semantic information present in text. In this paper, we propose a new architecture, HyperBERT, a mixed text-hypergraph model which simultaneously models hypergraph relational structure while maintaining the high-quality text encoding capabilities of a pre-trained BERT. Notably, HyperBERT presents results that achieve a new state-of-the-art on five challenging text-attributed hypergraph node classification benchmarks.

* 11 pages, 2 figures

Language Model Knowledge Distillation for Efficient Question Answering in Spanish

Dec 07, 2023

Adrián Bazaga, Pietro Liò, Gos Micklem

Abstract: Recent advances in the development of pre-trained Spanish language models have led to significant progress in many Natural Language Processing (NLP) tasks, such as question answering. However, the lack of efficient models imposes a barrier to the adoption of such models in resource-constrained environments. Therefore, smaller distilled models for the Spanish language could prove highly scalable and facilitate their further adoption on a variety of tasks and scenarios. In this work, we take one step in this direction by developing SpanishTinyRoBERTa, a compressed language model based on RoBERTa for efficient question answering in Spanish. To achieve this, we employ knowledge distillation from a large model onto a lighter model that allows for a wider implementation, even in areas with limited computational resources, whilst attaining negligible performance sacrifice. Our experiments show that the dense distilled model can still preserve the performance of its larger counterpart, while significantly speeding up inference. This work serves as a starting point for further research and investigation of model compression efforts for Spanish language models across various NLP tasks.

* 6 pages, 2 tables
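
The abstract above does not state which distillation objective is used. As a rough illustration only, the sketch below assumes the classic soft-target formulation, in which the student is trained to match the teacher's temperature-softened output distribution. The function names, the temperature value, and the toy logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions (assumed objective).

    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0
    )
    return kl * temperature ** 2


# Toy logits for a three-class output (hypothetical values).
teacher = [4.0, 1.0, 0.5]
matching_student = [4.0, 1.0, 0.5]
diverging_student = [0.5, 1.0, 4.0]

loss_match = distillation_loss(matching_student, teacher)
loss_diverge = distillation_loss(diverging_student, teacher)
```

A student that reproduces the teacher's logits incurs zero loss, while one that disagrees is penalised in proportion to how far its softened distribution drifts from the teacher's.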

SQLformer: Deep Auto-Regressive Query Graph Generation for Text-to-SQL Translation

Oct 27, 2023

Adrián Bazaga, Pietro Liò, Gos Micklem

Abstract: In recent years, there has been growing interest in text-to-SQL translation, which is the task of converting natural language questions into executable SQL queries. This technology is important for its potential to democratize data extraction from databases. However, some of its key hurdles include domain generalisation, which is the ability to adapt to previously unseen databases, and alignment of natural language questions with the corresponding SQL queries. To overcome these challenges, we introduce SQLformer, a novel Transformer architecture specifically crafted to perform text-to-SQL translation tasks. Our model predicts SQL queries as abstract syntax trees (ASTs) in an autoregressive way, incorporating structural inductive bias in the encoder and decoder layers. This bias, guided by database table and column selection, aids the decoder in generating SQL query ASTs represented as graphs in a Breadth-First Search canonical order. Comprehensive experiments illustrate the state-of-the-art performance of SQLformer in the challenging text-to-SQL Spider benchmark. Our implementation is available at https://github.com/AdrianBZG/SQLformer

* 11 pages, 4 figures
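
To make the phrase "Breadth-First Search canonical order" concrete, here is a minimal sketch of serialising a toy query tree level by level. The `Node` class, the node labels, and the tree shape are hypothetical illustrations and do not reflect SQLformer's actual AST grammar or code.

```python
from collections import deque


class Node:
    """A toy AST node: a label plus an ordered list of children."""

    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []


def bfs_order(root):
    """Return node labels in breadth-first (level-by-level) order."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        order.append(node.label)
        queue.extend(node.children)  # children keep their left-to-right order
    return order


# Hypothetical tree for: SELECT name FROM singer WHERE age > 30
tree = Node("query", [
    Node("select", [Node("singer.name")]),
    Node("from", [Node("singer")]),
    Node("where", [Node(">", [Node("singer.age"), Node("30")])]),
])

sequence = bfs_order(tree)
# ['query', 'select', 'from', 'where', 'singer.name', 'singer', '>', 'singer.age', '30']
```

In this ordering every clause node appears before any of its arguments, so an autoregressive decoder that emits nodes in this sequence commits to the overall query shape first and fills in tables, columns, and values afterwards.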

Unsupervised Fact Verification by Language Model Distillation

Sep 28, 2023

Adrián Bazaga, Pietro Liò, Gos Micklem

Abstract: Unsupervised fact verification aims to verify a claim using evidence from a trustworthy knowledge base without any kind of data annotation. To address this challenge, algorithms must produce features for every claim that are both semantically meaningful and compact enough to find a semantic alignment with the source information. In contrast to previous work, which tackled the alignment problem by learning over annotated corpora of claims and their corresponding labels, we propose SFAVEL (Self-supervised Fact Verification via Language Model Distillation), a novel unsupervised framework that leverages pre-trained language models to distil self-supervised features into high-quality claim-fact alignments without the need for annotations. This is enabled by a novel contrastive loss function that encourages features to attain high-quality claim and evidence alignments whilst preserving the semantic relationships across the corpora. Notably, we present results that achieve a new state-of-the-art on the standard FEVER fact verification benchmark (+8% accuracy) with linear evaluation.

Translating synthetic natural language to database queries: a polyglot deep learning framework

Apr 14, 2021

Adrián Bazaga, Nupur Gunwant, Gos Micklem

Abstract: The number of databases as well as their size and complexity is increasing. This creates a barrier to use especially for non-experts, who have to come to grips with the nature of the data, the way it has been represented in the database, and the specific query languages or user interfaces by which data are accessed. These difficulties worsen in research settings, where it is common to work with many different databases. One approach to improving this situation is to allow users to pose their queries in natural language. In this work, we describe a machine learning framework, Polyglotter, that in a general way supports the mapping of natural language searches to database queries. Importantly, it does not require the creation of manually annotated data for training and therefore can be applied easily to multiple domains. The framework is polyglot in the sense that it supports multiple different database engines that are accessed with a variety of query languages, including SQL and Cypher. Furthermore, Polyglotter also supports multi-class queries. Our results indicate that our framework performs well on both synthetic and real databases, and may provide opportunities for database maintainers to improve accessibility to their resources.

A Convolutional Neural Network for the Automatic Diagnosis of Collagen VI related Muscular Dystrophies

Jan 30, 2019

Adrián Bazaga, Mònica Roldán, Carmen Badosa, Cecilia Jiménez-Mallebrera, Josep M. Porta

Abstract: The development of machine learning systems for the diagnosis of rare diseases is challenging mainly due to the lack of data to study them. Despite this challenge, this paper proposes a system for the Computer Aided Diagnosis (CAD) of low-prevalence, congenital muscular dystrophies from confocal microscopy images. The proposed CAD system relies on a Convolutional Neural Network (CNN) which performs an independent classification for non-overlapping patches tiling the input image, and generates an overall decision summarizing the individual decisions for the patches on the query image. This decision scheme points to the possibly problematic areas in the input images and provides a global quantitative evaluation of the state of the patients, which is fundamental for diagnosis and to monitor the efficiency of therapies.

* Submitted for review to Expert Systems With Applications
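
The patch-based decision scheme described in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the CNN is replaced by a hypothetical mean-intensity threshold (`classify_patch`), and the overall decision is assumed to be the fraction of patches flagged, compared against 0.5. The abstract does not specify the actual aggregation rule or network.

```python
def tile_patches(image, patch_size):
    """Split a 2D image (list of rows) into non-overlapping square patches.

    Returns (top_left_position, patch) pairs. Border pixels that do not
    fill a whole patch are dropped.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            patch = [row[left:left + patch_size]
                     for row in image[top:top + patch_size]]
            patches.append(((top, left), patch))
    return patches


def classify_patch(patch, threshold=0.5):
    """Hypothetical stand-in for the CNN: flag a patch by mean intensity."""
    values = [v for row in patch for v in row]
    return 1 if sum(values) / len(values) > threshold else 0


def diagnose(image, patch_size, classifier=classify_patch):
    """Classify each patch independently, then summarise the decisions."""
    decisions = {pos: classifier(patch)
                 for pos, patch in tile_patches(image, patch_size)}
    score = sum(decisions.values()) / len(decisions)  # fraction flagged
    label = 1 if score > 0.5 else 0                   # assumed voting rule
    return label, score, decisions


# Toy 4x4 "image": only the top-left 2x2 patch is bright.
image = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
label, score, decisions = diagnose(image, patch_size=2)
```

The per-patch `decisions` map plays the role of the localisation output (which areas look problematic), while `score` is the kind of global quantitative summary the abstract mentions.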
