
Maria Liakata


Reformulating NLP tasks to Capture Longitudinal Manifestation of Language Disorders in People with Dementia

Oct 15, 2023
Dimitris Gkoumas, Matthew Purver, Maria Liakata

Figures 1–4

Dementia is associated with language disorders which impede communication. Here, we automatically learn linguistic disorder patterns by making use of a moderately sized pre-trained language model and forcing it to focus on reformulated natural language processing (NLP) tasks and associated linguistic patterns. Our experiments show that NLP tasks that encapsulate contextual information and enhance the gradient signal with linguistic patterns benefit performance. We then use the probability estimates from the best model to construct digital linguistic markers measuring the overall quality of communication and the intensity of a variety of language disorders. We investigate how the digital markers characterize dementia speech from a longitudinal perspective. We find that our proposed communication marker robustly and reliably characterizes the language of people with dementia, outperforming existing linguistic approaches, and shows external validity via a significant correlation with clinical markers of behaviour. Finally, our proposed linguistic disorder markers provide useful insights into the gradual language impairment associated with disease progression.

* Accepted to appear at EMNLP 2023

A Digital Language Coherence Marker for Monitoring Dementia

Oct 14, 2023
Dimitris Gkoumas, Adam Tsakalidis, Maria Liakata

Figures 1–4

The use of spontaneous language to derive appropriate digital markers has emerged as a promising, non-intrusive method to diagnose and monitor dementia. Here we propose methods to capture language coherence as a cost-effective, human-interpretable digital marker for monitoring cognitive changes in people with dementia. We introduce a novel task to learn the temporal logical consistency of utterances in short transcribed narratives and investigate a range of neural approaches. We compare such language coherence patterns between people with dementia and healthy controls and conduct a longitudinal evaluation against three clinical bio-markers to investigate the reliability of our proposed digital coherence marker. The coherence marker shows a significant difference between people with mild cognitive impairment, those with Alzheimer's Disease and healthy controls. Moreover, our analysis shows a strong association between the coherence marker and the clinical bio-markers, as well as potential for generalisation to other related conditions.
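
The temporal-consistency task can be illustrated with a minimal sketch: consecutive utterance pairs in their original order serve as positive examples, the same pairs swapped serve as negatives, and a narrative's coherence marker is the fraction of pairs a scorer orders correctly. The `scorer` below is a toy keyword rule standing in for the trained neural models the paper investigates; all names and the example narrative are illustrative.

```python
def make_pairs(utterances):
    """Build the temporal-consistency task: positive = consecutive utterances
    in their original order, negative = the same pair swapped."""
    pairs = []
    for a, b in zip(utterances, utterances[1:]):
        pairs.append((a, b, 1))  # correct temporal order
        pairs.append((b, a, 0))  # swapped order
    return pairs

def coherence_marker(utterances, scorer):
    """Fraction of pairs whose temporal order the scorer judges correctly:
    an interpretable, narrative-level coherence score in [0, 1]."""
    pairs = make_pairs(utterances)
    correct = sum(int(scorer(a, b) == label) for a, b, label in pairs)
    return correct / len(pairs)

# Toy stand-in for a trained model estimating whether b plausibly follows a
scorer = lambda a, b: int("Then" in b)

narrative = ["I got up.", "Then I made coffee.", "Then I left for work."]
print(coherence_marker(narrative, scorer))  # → 0.75
```

A trained scorer would replace the keyword rule with an estimate of P(b follows a); the marker itself stays the same simple average, which is what makes it human-interpretable.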

* Accepted to appear at EMNLP 2023

Automated clinical coding using off-the-shelf large language models

Oct 10, 2023
Joseph S. Boyle, Antanas Kascenas, Pat Lok, Maria Liakata, Alison Q. O'Neil

The task of assigning diagnostic ICD codes to patient hospital admissions is typically performed by expert human coders. Efforts towards automated ICD coding are dominated by supervised deep learning models. However, difficulties in learning to predict the large number of rare codes remain a barrier to adoption in clinical practice. In this work, we leverage off-the-shelf pre-trained generative large language models (LLMs) to develop a practical solution that is suitable for zero-shot and few-shot code assignment. Unsupervised pre-training alone does not guarantee precise knowledge of the ICD ontology and the specialist clinical coding task; we therefore frame the task as information extraction, providing a description of each coded concept and asking the model to retrieve related mentions. For efficiency, rather than iterating over all codes, we leverage the hierarchical nature of the ICD ontology to sparsely search for relevant codes. Then, in a second stage, which we term 'meta-refinement', we utilise GPT-4 to select a subset of the relevant labels as predictions. We validate our method using Llama-2, GPT-3.5 and GPT-4 on the CodiEsp dataset of ICD-coded clinical case documents. Our tree-search method achieves state-of-the-art performance on rarer classes, achieving the best macro-F1 of 0.225, whilst achieving a slightly lower micro-F1 of 0.157, compared to 0.216 and 0.219 respectively for PLM-ICD. To the best of our knowledge, this is the first method for automated ICD coding requiring no task-specific learning.
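
The hierarchical sparse search can be sketched as a depth-first traversal that only expands branches judged relevant. This is a minimal sketch: a keyword check stands in for the LLM relevance prompt, the GPT-4 'meta-refinement' stage is not shown, and the tiny ontology fragment and all descriptions are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    code: str
    description: str
    children: list = field(default_factory=list)

def is_relevant(note: str, node: Node) -> bool:
    # Stand-in for prompting an LLM with the code description and asking
    # whether the note mentions the concept; here a crude keyword check.
    return any(w in note.lower() for w in node.description.lower().split())

def tree_search(note: str, nodes: list) -> list:
    """Sparse depth-first search: only expand code families the relevance
    check accepts, so most of the ontology is never queried."""
    hits = []
    for node in nodes:
        if not is_relevant(note, node):
            continue                  # prune the whole subtree
        if node.children:
            hits.extend(tree_search(note, node.children))
        else:
            hits.append(node.code)    # leaf: candidate code for refinement
    return hits

# Toy ICD-like ontology fragment (chapters -> categories)
chapters = [
    Node("E10-E14", "diabetes mellitus", [Node("E11", "type 2 diabetes mellitus")]),
    Node("I20-I25", "ischaemic heart diseases", [Node("I21", "acute myocardial infarction")]),
]
note = "Patient with type 2 diabetes, no cardiac history."
print(tree_search(note, chapters))  # → ['E11']
```

Because irrelevant chapters are pruned at the top level, the number of relevance queries grows with the depth of relevant branches rather than the total number of codes.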

* 9 pages, 4 figures 

Creation and evaluation of timelines for longitudinal user posts

Mar 10, 2023
Anthony Hills, Adam Tsakalidis, Federico Nanni, Ioannis Zachos, Maria Liakata

Figures 1–4

There is increasing interest in working with user-generated content in social media, especially textual posts over time. Currently there is no consistent, meaningful way of segmenting user posts into timelines that improves the quality and reduces the cost of manual annotation. Here we propose a set of methods for segmenting longitudinal user posts into timelines likely to contain interesting moments of change in a user's behaviour, based on their online posting activity. We also propose a novel framework for evaluating timelines and show its applicability in the context of two different social media datasets. Finally, we present a discussion of the linguistic content of highly ranked timelines.
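
As a minimal illustration of activity-based segmentation, one simple candidate method splits a user's chronologically sorted posts into timelines wherever the gap between consecutive posts exceeds a threshold. The seven-day threshold below is hypothetical, and the paper's methods are more sophisticated (e.g. detecting anomalous changes in posting activity).

```python
from datetime import datetime, timedelta

def segment_timelines(timestamps, max_gap=timedelta(days=7)):
    """Split a chronologically sorted post stream into timelines, starting a
    new timeline whenever the gap between consecutive posts exceeds max_gap."""
    timelines, current = [], [timestamps[0]]
    for prev, ts in zip(timestamps, timestamps[1:]):
        if ts - prev > max_gap:
            timelines.append(current)  # close the current timeline
            current = []
        current.append(ts)
    timelines.append(current)
    return timelines

# Five posts: a burst in early January, then another after a long silence
posts = [datetime(2023, 1, d) for d in (1, 2, 3, 20, 21)]
print([len(t) for t in segment_timelines(posts)])  # → [3, 2]
```

Each resulting timeline is a candidate span for annotation, so a good segmentation concentrates annotator effort on windows where a change in behaviour is plausible.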

* Accepted at EACL 2023 (main, long); camera-ready version 

PANACEA: An Automated Misinformation Detection System on COVID-19

Feb 28, 2023
Runcong Zhao, Miguel Arana-Catania, Lixing Zhu, Elena Kochkina, Lin Gui, Arkaitz Zubiaga, Rob Procter, Maria Liakata, Yulan He

Figures 1–4

In this demo, we introduce PANACEA, a web-based misinformation detection system for COVID-19-related claims, which has two modules: fact-checking and rumour detection. Our fact-checking module, supported by novel natural language inference methods with a self-attention network, outperforms state-of-the-art approaches. It also provides an automated veracity assessment and ranked supporting evidence, together with the stance of each piece of evidence towards the claim being checked. In addition, PANACEA adapts the bi-directional graph convolutional networks model, which detects rumours based on comment networks of related tweets rather than relying on a knowledge base. This rumour detection module assists by warning users in the early stages of a rumour, when a knowledge base may not yet be available.


A Pipeline for Generating, Annotating and Employing Synthetic Data for Real World Question Answering

Nov 30, 2022
Matthew Maufe, James Ravenscroft, Rob Procter, Maria Liakata

Figures 1–4

Question Answering (QA) is a growing area of research, often used to facilitate the extraction of information from within documents. State-of-the-art QA models are usually pre-trained on domain-general corpora like Wikipedia and thus tend to struggle on out-of-domain documents without fine-tuning. We demonstrate that synthetic domain-specific datasets can be generated easily using domain-general models, while still providing significant improvements to QA performance. We present two new tools for this task: a flexible pipeline for validating the synthetic QA data and training downstream models on it, and an online interface to facilitate human annotation of this generated data. Using this interface, crowdworkers labelled 1117 synthetic QA pairs, which we then used to fine-tune downstream models and improve domain-specific QA performance by 8.75 F1.

* To be published in the companion proceedings of EMNLP 2022. 17 pages (11 of which are in the appendix), 7 figures (3 of which are in the appendix) 

Unsupervised Opinion Summarisation in the Wasserstein Space

Nov 27, 2022
Jiayu Song, Iman Munire Bilal, Adam Tsakalidis, Rob Procter, Maria Liakata

Figures 1–4

Opinion summarisation synthesises opinions expressed in a group of documents discussing the same topic to produce a single summary. Recent work has looked at opinion summarisation of clusters of social media posts. Such posts are noisy and have unpredictable structure, posing additional challenges for the construction of the summary distribution and the preservation of meaning compared to online reviews, which have so far been the focus of opinion summarisation. To address these challenges we present WassOS, an unsupervised abstractive summarisation model which makes use of the Wasserstein distance. A Variational Autoencoder is used to obtain the distribution of documents/posts, and the distributions are disentangled into separate semantic and syntactic spaces. The summary distribution is obtained using the Wasserstein barycenter of the semantic and syntactic distributions. A latent variable sampled from the summary distribution is fed into a GRU decoder with a transformer layer to produce the final summary. Our experiments on multiple datasets including Twitter clusters, Reddit threads, and reviews show that WassOS almost always outperforms the state-of-the-art on ROUGE metrics and consistently produces the best summaries with respect to meaning preservation according to human evaluations.
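
The barycenter step has a convenient closed form for the diagonal Gaussians a VAE encoder produces: coordinate-wise, the 2-Wasserstein barycenter of univariate Gaussians is the Gaussian whose mean and standard deviation are the weighted averages of the input means and standard deviations. A minimal sketch of just that step (the semantic/syntactic disentanglement and GRU decoding stages are not shown, and the toy latents are illustrative):

```python
import numpy as np

def gaussian_w2_barycenter(mus, sigmas, weights):
    """2-Wasserstein barycenter of diagonal Gaussians. Coordinate-wise, the
    barycenter of N(mu_i, sigma_i^2) with weights w_i is the Gaussian with
    mean sum(w_i * mu_i) and standard deviation sum(w_i * sigma_i)."""
    w = np.asarray(weights)[:, None]
    return (w * np.asarray(mus)).sum(0), (w * np.asarray(sigmas)).sum(0)

# Two posts encoded as 2-D diagonal Gaussian latents, weighted equally
mus = [[0.0, 2.0], [2.0, 0.0]]
sigmas = [[1.0, 0.5], [0.5, 1.0]]
mu_b, sigma_b = gaussian_w2_barycenter(mus, sigmas, [0.5, 0.5])
print(mu_b, sigma_b)  # → [1. 1.] [0.75 0.75]
```

Unlike simple averaging of densities, the barycenter interpolates the distributions' geometry, which is why it serves as a natural "consensus" latent to sample the summary from.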


Template-based Abstractive Microblog Opinion Summarisation

Aug 08, 2022
Iman Munire Bilal, Bo Wang, Adam Tsakalidis, Dong Nguyen, Rob Procter, Maria Liakata

Figures 1–4

We introduce the task of microblog opinion summarisation (MOS) and share a dataset of 3100 gold-standard opinion summaries to facilitate research in this domain. The dataset contains summaries of tweets spanning a 2-year period and covers more topics than any other public Twitter summarisation dataset. Summaries are abstractive in nature and have been created by journalists skilled in summarising news articles following a template separating factual information (main story) from author opinions. Our method differs from previous work on generating gold-standard summaries from social media, which usually involves selecting representative posts and thus favours extractive summarisation models. To showcase the dataset's utility and challenges, we benchmark a range of abstractive and extractive state-of-the-art summarisation models and achieve good performance, with the former outperforming the latter. We also show that fine-tuning is necessary to improve performance and investigate the benefits of using different sample sizes.

* Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2022. Pre-MIT Press publication version 

Personalised recommendations of sleep behaviour with neural networks using sleep diaries captured in Sleepio

Jul 29, 2022
Alejo Nevado-Holgado, Colin Espie, Maria Liakata, Alasdair Henry, Jenny Gu, Niall Taylor, Kate Saunders, Tom Walker, Chris Miller

Figures 1–4

Sleepio™ is a digital mobile phone and web platform that uses techniques from cognitive behavioural therapy (CBT) to improve sleep in people with sleep difficulty. As part of this process, Sleepio captures data about the sleep behaviour of the users that have consented to such data being processed. For neural networks, the scale of the data is an opportunity to train meaningful models translatable to actual clinical practice. In collaboration with Big Health, the therapeutics company that created and utilizes Sleepio, we have analysed data from a random sample of 401,174 sleep diaries and built a neural network to model sleep behaviour and sleep quality of each individual in a personalised manner. We demonstrate that this neural network is more accurate than standard statistical methods in predicting the sleep quality of an individual based on their behaviour over the last 10 days. We compare model performance in a wide range of hyperparameter settings representing various scenarios. We further show that the neural network can be used to produce personalised recommendations of what sleep habits users should follow to maximise sleep quality, and show that these recommendations are substantially better than the ones generated by standard methods. We finally show that the neural network can explain the recommendation given to each participant and calculate confidence intervals for each prediction, all of which are essential for clinicians to be able to adopt such a tool in clinical practice.
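
The 10-day windowing behind such a predictive model can be sketched as follows. Everything here is a stand-in: the data are synthetic, a linear least-squares fit replaces the neural network, and the real inputs would be behavioural diary fields (bedtime, caffeine, time awake, ...) rather than past quality scores.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical diary: one quality score per night, drifting slowly over time
quality = rng.normal(size=200).cumsum() * 0.05 + 3.0

WINDOW = 10  # predict tonight's quality from the previous 10 days
X = np.stack([quality[i:i + WINDOW] for i in range(len(quality) - WINDOW)])
y = quality[WINDOW:]

# Stand-in for the neural network: linear least squares with an intercept
A = np.c_[X, np.ones(len(X))]
w, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ w

print(X.shape, round(float(np.mean((pred - y) ** 2)), 4))
```

The personalised-recommendation step then amounts to asking the fitted model which changes to the window's features would most improve the predicted quality, which is straightforward to read off for any differentiable model.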


PHEMEPlus: Enriching Social Media Rumour Verification with External Evidence

Jul 28, 2022
John Dougrez-Lewis, Elena Kochkina, M. Arana-Catania, Maria Liakata, Yulan He

Figures 1–4

Work on social media rumour verification utilises signals from posts, their propagation and the users involved. Other lines of work target identifying and fact-checking claims based on information from Wikipedia or trustworthy news articles, without considering social media context. However, works combining information from social media with external evidence from the wider web are lacking. To facilitate research in this direction, we release a novel dataset, PHEMEPlus, an extension of the PHEME benchmark, which contains social media conversations as well as relevant external evidence for each rumour. We demonstrate the effectiveness of incorporating such evidence in improving rumour verification models. Additionally, as part of the evidence collection, we evaluate various ways of query formulation to identify the most effective method.

* 10 pages, 1 figure, 5 tables; presented at the Fifth Fact Extraction and VERification Workshop (FEVER), 2022