Oladimeji Farri

General-Purpose vs. Domain-Adapted Large Language Models for Extraction of Data from Thoracic Radiology Reports

Nov 28, 2023
Ali H. Dhanaliwala, Rikhiya Ghosh, Sanjeev Kumar Karn, Poikavila Ullaskrishnan, Oladimeji Farri, Dorin Comaniciu, Charles E. Kahn

Radiologists produce unstructured data that could be valuable for clinical care when consumed by information systems; however, variability in reporting style limits its use. This study compares the performance of a system built on a domain-adapted language model (RadLing) with that of a general-purpose large language model (GPT-4) in extracting common data elements (CDEs) from thoracic radiology reports. Three radiologists annotated a retrospective dataset of 1300 thoracic reports (900 training, 400 test) and mapped them to 21 pre-selected relevant CDEs. The RadLing system generated sentence embeddings, identified CDEs by cosine similarity, and mapped them to values with a lightweight mapper. The GPT-4 system used OpenAI's general-purpose embeddings to identify relevant CDEs and GPT-4 to map them to values. The output CDE:value pairs were compared with the reference standard; an identical match was considered a true positive. Precision (positive predictive value) was 96% (2700/2824) for RadLing and 99% (2034/2047) for GPT-4. Recall (sensitivity) was 94% (2700/2876) for RadLing and 70% (2034/2887) for GPT-4; the difference was statistically significant (P<.001). RadLing's domain-adapted embeddings were more sensitive in CDE identification (95% vs 71%), and its lightweight mapper achieved precision in value assignment comparable to GPT-4 (95.4% vs 95.0%). Overall, the domain-adapted RadLing system surpassed the GPT-4 system in extracting common data elements from radiology reports: its embeddings outperform OpenAI's general-purpose embeddings in CDE identification, and its lightweight value mapper matches the much larger GPT-4 in precision, while offering operational advantages such as local deployment and reduced runtime costs.
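
To make the embedding-based matching step concrete, here is a minimal sketch of identifying CDEs by cosine similarity between a report sentence and CDE descriptions. The embedding function, CDE names, and threshold are illustrative stand-ins, not the actual RadLing model, OpenAI embeddings, or the study's 21 CDEs.

```python
import numpy as np

# Stand-in embedding: a hashed bag-of-words vector, only to keep the sketch
# self-contained and runnable. The paper's systems use RadLing or OpenAI embeddings.
def embed(text: str, dim: int = 256) -> np.ndarray:
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Illustrative CDE descriptions (hypothetical, not the study's CDE set).
CDE_DESCRIPTIONS = {
    "pleural_effusion": "presence or absence of pleural effusion",
    "pneumothorax": "presence or absence of pneumothorax",
    "cardiomegaly": "enlargement of the cardiac silhouette",
}

def identify_cdes(sentence: str, threshold: float = 0.3) -> list[str]:
    """Return CDEs whose description embedding is close to the sentence embedding."""
    s_vec = embed(sentence)
    matches = []
    for cde, description in CDE_DESCRIPTIONS.items():
        similarity = float(np.dot(s_vec, embed(description)))  # cosine; vectors are unit-norm
        if similarity >= threshold:
            matches.append(cde)
    return matches

print(identify_cdes("There is a small left pleural effusion."))
```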

Generation of Radiology Findings in Chest X-Ray by Leveraging Collaborative Knowledge

Jun 18, 2023
Manuela Daniela Danu, George Marica, Sanjeev Kumar Karn, Bogdan Georgescu, Awais Mansoor, Florin Ghesu, Lucian Mihai Itu, Constantin Suciu, Sasa Grbic, Oladimeji Farri, Dorin Comaniciu

Among all the sub-sections in a typical radiology report, the Clinical Indications, Findings, and Impression often reflect important details about the health status of a patient. The information included in Impression is also often covered in Findings. While Findings and Impression can be deduced by inspecting the image, Clinical Indications often require additional context. The cognitive task of interpreting medical images remains the most critical and often time-consuming step in the radiology workflow. Instead of generating an end-to-end radiology report, in this paper, we focus on generating the Findings from automated interpretation of medical images, specifically chest X-rays (CXRs). Thus, this work focuses on reducing the workload of radiologists who spend most of their time either writing or narrating the Findings. Unlike past research, which addresses radiology report generation as a single-step image captioning task, we have further taken into consideration the complexity of interpreting CXR images and propose a two-step approach: (a) detecting the regions with abnormalities in the image, and (b) generating relevant text for regions with abnormalities by employing a generative large language model (LLM). This two-step approach introduces a layer of interpretability and aligns the framework with the systematic reasoning that radiologists use when reviewing a CXR.
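
The two-step flow described above (detect abnormal regions, then generate text per region) can be sketched as a small pipeline. The detector stub, prompt wording, and stand-in LLM below are hypothetical placeholders for illustration, not the paper's actual components.

```python
from dataclasses import dataclass

@dataclass
class RegionFinding:
    region: str       # anatomical region flagged by the detector
    abnormality: str  # abnormality label predicted for that region

# Step (a): abnormality detection. A real system would run an image model on the
# CXR; this hypothetical stub returns fixed detections so the sketch is runnable.
def detect_abnormal_regions(cxr_image) -> list[RegionFinding]:
    return [RegionFinding("left lower lobe", "consolidation"),
            RegionFinding("right costophrenic angle", "pleural effusion")]

# Step (b): text generation. A real system would prompt a generative LLM; the
# prompt below is illustrative only.
def generate_findings(detections: list[RegionFinding], llm) -> str:
    sentences = []
    for d in detections:
        prompt = f"Write one radiology Findings sentence describing {d.abnormality} in the {d.region}."
        sentences.append(llm(prompt))
    return " ".join(sentences)

# Example run with a trivial stand-in "LLM" that echoes the request.
findings = generate_findings(detect_abnormal_regions(None),
                             llm=lambda prompt: f"[LLM output for: {prompt}]")
print(findings)
```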

* Information Technology and Quantitative Management (ITQM 2023)

shs-nlp at RadSum23: Domain-Adaptive Pre-training of Instruction-tuned LLMs for Radiology Report Impression Generation

Jun 05, 2023
Sanjeev Kumar Karn, Rikhiya Ghosh, Kusuma P, Oladimeji Farri

Instruction-tuned generative large language models (LLMs) like ChatGPT and Bloomz possess excellent generalization abilities, but they face limitations in understanding radiology reports, particularly in generating the IMPRESSIONS section from the FINDINGS section. They tend to generate either verbose or incomplete IMPRESSIONS, mainly due to insufficient exposure to medical text data during training. We present a system that leverages large-scale medical text data for domain-adaptive pre-training of instruction-tuned LLMs, enhancing their medical knowledge and performance on specific medical tasks. We show that this system performs better in a zero-shot setting than a number of pretrain-and-finetune adaptation methods on the IMPRESSIONS generation task, and ranks 1st among participating systems in Task 1B: Radiology Report Summarization at the BioNLP 2023 workshop.
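
As a rough illustration of domain-adaptive (continued) pre-training of an instruction-tuned causal LM, the sketch below continues training on in-domain plain text with Hugging Face Transformers. The model name, corpus file, and hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)
from datasets import load_dataset

model_name = "bigscience/bloomz-560m"   # small instruction-tuned model, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical plain-text corpus of radiology reports, one document per line.
dataset = load_dataset("text", data_files={"train": "radiology_reports.txt"})["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                      batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bloomz-radiology",
                           per_device_train_batch_size=2,
                           num_train_epochs=1,
                           learning_rate=2e-5),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # causal-LM objective
)
trainer.train()
```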

* BioNLP 2023, Co-located with ACL 2023  
* 1st Place in Task 1B: Radiology Report Summarization at BioNLP 2023 

RadLing: Towards Efficient Radiology Report Understanding

Jun 04, 2023
Rikhiya Ghosh, Sanjeev Kumar Karn, Manuela Daniela Danu, Larisa Micu, Ramya Vunikili, Oladimeji Farri

Most natural language tasks in the radiology domain use language models pre-trained on biomedical corpora. Few pretrained language models have been trained specifically for radiology, and fewer still have been trained in a low-data setting and gone on to produce comparable results on fine-tuning tasks. We present RadLing, a continuously pretrained language model based on the ELECTRA-small (Clark et al., 2020) architecture and trained on over 500K radiology reports, which can compete with state-of-the-art results on fine-tuning tasks in the radiology domain. Our main contribution is knowledge-aware masking, a taxonomic knowledge-assisted pretraining task that dynamically masks tokens to inject knowledge during pretraining. In addition, we introduce a knowledge base-aided vocabulary extension to adapt the general tokenization vocabulary to the radiology domain.
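
The core of knowledge-aware masking is masking tokens that belong to a domain taxonomy more aggressively than ordinary tokens, so the model must learn to recover domain terms. Below is a minimal sketch of that idea; the taxonomy terms and masking rates are illustrative, not RadLing's actual taxonomy or schedule.

```python
import random

# Illustrative taxonomy of radiology terms (hypothetical, tiny).
TAXONOMY_TERMS = {"effusion", "pneumothorax", "consolidation", "atelectasis"}

def knowledge_aware_mask(tokens: list[str], mask_token: str = "[MASK]",
                         base_rate: float = 0.15, term_rate: float = 0.5) -> list[str]:
    """Mask taxonomy terms at a higher rate than ordinary tokens."""
    masked = []
    for tok in tokens:
        rate = term_rate if tok.lower() in TAXONOMY_TERMS else base_rate
        masked.append(mask_token if random.random() < rate else tok)
    return masked

print(knowledge_aware_mask("small left pleural effusion without pneumothorax".split()))
```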

* 61st Annual Meeting of the Association for Computational Linguistics (ACL), July 9-14, 2023, Toronto, Canada  
* Association for Computational Linguistics (ACL), 2023 

Differentiable Multi-Agent Actor-Critic for Multi-Step Radiology Report Summarization

Mar 15, 2022
Sanjeev Kumar Karn, Ning Liu, Hinrich Schuetze, Oladimeji Farri

The IMPRESSIONS section of a radiology report about an imaging study summarizes the radiologist's reasoning and conclusions, and it also aids the referring physician in confirming or excluding certain diagnoses. A cascade of tasks is required to automatically generate an abstractive summary of the typical information-rich radiology report: acquiring the salient content from the report and generating a concise, easily consumable IMPRESSIONS section. Prior research on radiology report summarization has focused on single-step end-to-end models, which subsume the task of salient content acquisition. To fully explore the cascade structure and explainability of radiology report summarization, we introduce two innovations. First, we design a two-step approach: extractive summarization followed by abstractive summarization. Second, we break the extractive part down into two independent tasks: extraction of salient (1) sentences and (2) keywords. Experiments on a publicly available radiology report dataset show that our novel approach leads to a more precise summary than single-step and two-step-with-single-extractive-process baselines, with an overall improvement in F1 score of 3-4%.
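
To illustrate the cascade structure only (not the paper's learned multi-agent actor-critic components), here is a toy sketch: extract keywords and salient sentences from the FINDINGS, then hand both to an abstractive step. The scoring heuristics and summarizer stub are assumptions made purely for illustration.

```python
def extract_keywords(findings_sentences, top_k=5):
    # Crude frequency-based keyword extraction; a learned extractor in the paper.
    counts = {}
    for sent in findings_sentences:
        for word in sent.lower().split():
            if len(word) > 4:                       # rough content-word filter
                counts[word] = counts.get(word, 0) + 1
    return sorted(counts, key=counts.get, reverse=True)[:top_k]

def extract_salient_sentences(findings_sentences, keywords, top_k=2):
    # Score sentences by keyword overlap; a learned extractor in the paper.
    def score(sent):
        return sum(kw in sent.lower() for kw in keywords)
    return sorted(findings_sentences, key=score, reverse=True)[:top_k]

def abstractive_summary(sentences, keywords):
    # Stand-in for a learned abstractive decoder conditioned on both inputs.
    return "IMPRESSION: " + " ".join(sentences)

findings = ["Heart size is normal.",
            "There is a small left pleural effusion with adjacent atelectasis.",
            "No pneumothorax is identified."]
kws = extract_keywords(findings)
print(abstractive_summary(extract_salient_sentences(findings, kws), kws))
```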

* 60th Annual Meeting of the Association for Computational Linguistics, Dublin, Ireland, 2022  
* Accepted at 60th Annual Meeting of the Association for Computational Linguistics 2022 Main Conference 

Assertion Detection in Multi-Label Clinical Text using Scope Localization

May 19, 2020
Rajeev Bhatt Ambati, Ahmed Ada Hanifi, Ramya Vunikili, Puneet Sharma, Oladimeji Farri

Multi-label sentences (text) in the clinical domain result from the rich description of scenarios during patient care. State-of-the-art methods for assertion detection mostly address this task in the setting of a single assertion label per sentence. In addition, a few rule-based and deep learning methods perform negation/assertion scope detection on single-label text. Extending these methods to multi-label sentences without diminishing performance is a significant challenge. We therefore developed a convolutional neural network (CNN) architecture that localizes multiple labels and their scopes in a single-stage, end-to-end fashion, and we demonstrate that our model performs at least 12% better than the state of the art on multi-label clinical text.
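
A minimal sketch of the single-stage idea: a CNN over token embeddings emits a tag per token (e.g., BIO-style scope tags), so several assertion scopes can be localized in one pass over one sentence. The layer sizes and tag set are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class ScopeLocalizerCNN(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=64, num_tags=5, kernel_size=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Conv1d(embed_dim, 128, kernel_size, padding=kernel_size // 2)
        self.classifier = nn.Linear(128, num_tags)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embed(token_ids).transpose(1, 2)      # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x)).transpose(1, 2)   # (batch, seq_len, 128)
        return self.classifier(x)                      # per-token tag logits

model = ScopeLocalizerCNN()
logits = model(torch.randint(0, 5000, (2, 20)))        # two sentences of 20 tokens
print(logits.shape)                                    # torch.Size([2, 20, 5])
```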

DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference

Apr 11, 2018
Reza Ghaeini, Sadid A. Hasan, Vivek Datla, Joey Liu, Kathy Lee, Ashequl Qadir, Yuan Ling, Aaditya Prakash, Xiaoli Z. Fern, Oladimeji Farri

We present a novel deep learning architecture to address the natural language inference (NLI) task. Existing approaches mostly rely on simple reading mechanisms that encode the premise and hypothesis independently. Instead, we propose a dependent reading bidirectional LSTM network (DR-BiLSTM) to efficiently model the relationship between a premise and a hypothesis during encoding and inference. We also introduce a sophisticated ensemble strategy to combine our proposed models, which noticeably improves the final predictions. Finally, we demonstrate how the results can be improved further with an additional preprocessing step. Our evaluation shows that DR-BiLSTM obtains the best single-model and ensemble-model results, achieving new state-of-the-art scores on the Stanford NLI dataset.
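
A minimal sketch of the "dependent reading" idea: each sentence is re-encoded with a BiLSTM whose initial state comes from reading the other sentence, so premise and hypothesis are each read in the context of the other. Dimensions and weight sharing below are simplifying assumptions; the full model adds attention, an inference stage, and ensembling.

```python
import torch
import torch.nn as nn

class DependentReader(nn.Module):
    def __init__(self, embed_dim=100, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden, bidirectional=True, batch_first=True)

    def forward(self, premise_emb, hypothesis_emb):
        _, prem_state = self.lstm(premise_emb)              # read premise alone
        hyp_dep, _ = self.lstm(hypothesis_emb, prem_state)  # read hypothesis dependent on premise
        _, hyp_state = self.lstm(hypothesis_emb)            # read hypothesis alone
        prem_dep, _ = self.lstm(premise_emb, hyp_state)     # read premise dependent on hypothesis
        return prem_dep, hyp_dep

reader = DependentReader()
p = torch.randn(4, 12, 100)   # 4 premises, 12 tokens, 100-dim embeddings
h = torch.randn(4, 9, 100)    # 4 hypotheses, 9 tokens
prem_dep, hyp_dep = reader(p, h)
print(prem_dep.shape, hyp_dep.shape)   # torch.Size([4, 12, 256]) torch.Size([4, 9, 256])
```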

* 18 pages, Accepted as a long paper at NAACL HLT 2018 

Condensed Memory Networks for Clinical Diagnostic Inferencing

Jan 03, 2017
Aaditya Prakash, Siyuan Zhao, Sadid A. Hasan, Vivek Datla, Kathy Lee, Ashequl Qadir, Joey Liu, Oladimeji Farri

Diagnosis of a clinical condition is a challenging task that often requires significant medical investigation. Previous work on diagnostic inferencing mostly considers multivariate observational data (e.g., physiological signals, lab tests). In contrast, we explore the problem using free-text medical notes recorded in an electronic health record (EHR). Complex tasks like these can benefit from structured knowledge bases, but those do not scale; we instead exploit raw text from Wikipedia as a knowledge source. Memory networks have been demonstrated to be effective in tasks that require comprehension of free-form text; they use the final iteration of the learned representation to predict probable classes. We introduce condensed memory neural networks (C-MemNNs), a novel model with iterative condensation of memory representations that preserves the hierarchy of features in the memory. Experiments on the MIMIC-III dataset show that the proposed model outperforms other variants of memory networks in predicting the most probable diagnoses given a complex clinical scenario.
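
A rough sketch of the iterative-condensation idea: alongside the usual memory-hop state, a condensed state folds in a down-projected copy of the hop state at every hop, so earlier hops persist at coarser resolution. The attention, projection, and output layers are simplified assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class CondensedMemoryHops(nn.Module):
    def __init__(self, dim=128, hops=3):
        super().__init__()
        self.hops = hops
        self.update = nn.Linear(2 * dim, dim)      # combines hop state u with attended memory
        self.condense = nn.Linear(2 * dim, dim)    # folds u into the condensed state c

    def forward(self, query, memory):              # query: (B, D), memory: (B, N, D)
        u, c = query, torch.zeros_like(query)
        for _ in range(self.hops):
            attn = torch.softmax(torch.bmm(memory, u.unsqueeze(2)).squeeze(2), dim=1)  # (B, N)
            read = torch.bmm(attn.unsqueeze(1), memory).squeeze(1)                     # (B, D)
            u = torch.relu(self.update(torch.cat([u, read], dim=1)))
            c = torch.relu(self.condense(torch.cat([c, u], dim=1)))   # iterative condensation
        return u, c

model = CondensedMemoryHops()
u, c = model(torch.randn(2, 128), torch.randn(2, 50, 128))
print(u.shape, c.shape)   # torch.Size([2, 128]) torch.Size([2, 128])
```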

* Accepted to AAAI 2017 

Neural Paraphrase Generation with Stacked Residual LSTM Networks

Oct 13, 2016
Aaditya Prakash, Sadid A. Hasan, Kathy Lee, Vivek Datla, Ashequl Qadir, Joey Liu, Oladimeji Farri

In this paper, we propose a novel neural approach for paraphrase generation. Conventional paraphrase generation methods either leverage hand-written rules and thesauri-based alignments, or use statistical machine learning principles. To the best of our knowledge, this work is the first to explore deep learning models for paraphrase generation. Our primary contribution is a stacked residual LSTM network, where we add residual connections between LSTM layers. This allows for efficient training of deep LSTMs. We evaluate our model and other state-of-the-art deep learning models on three different datasets: PPDB, WikiAnswers and MSCOCO. Evaluation results demonstrate that our model outperforms sequence-to-sequence, attention-based and bidirectional LSTM models on BLEU, METEOR, TER and an embedding-based sentence similarity metric.
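
A minimal sketch of a stacked residual LSTM: the input of each LSTM layer is added to its output, easing the training of deeper stacks. The layer count and sizes are illustrative, not the paper's exact configuration (which places residual connections at specific depths of an encoder-decoder model).

```python
import torch
import torch.nn as nn

class StackedResidualLSTM(nn.Module):
    def __init__(self, dim=256, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.LSTM(dim, dim, batch_first=True) for _ in range(num_layers)])

    def forward(self, x):                 # x: (batch, seq_len, dim)
        for lstm in self.layers:
            out, _ = lstm(x)
            x = x + out                   # residual connection between layers
        return x

encoder = StackedResidualLSTM()
print(encoder(torch.randn(8, 15, 256)).shape)   # torch.Size([8, 15, 256])
```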

* COLING 2016 