Chronic diseases such as diabetes are leading causes of morbidity and mortality worldwide. Numerous studies have applied deep learning models to diagnosis. However, most previous studies had limitations, including reliance on publicly available datasets (e.g., MIMIC) and imbalanced data. In this study, we collected five years of electronic health records (EHRs) from the Taiwan hospital database, comprising 1,420,596 clinical notes, 387,392 laboratory test results, and more than 1,505 laboratory test items, with a focus on pre-training large language models. We proposed a novel Large Language Multimodal Models (LLMMs) framework that incorporates multimodal data from clinical notes and laboratory test results to predict chronic disease risk. Our method combined a text embedding encoder with a multi-head attention layer to learn laboratory test values, and used a deep neural network (DNN) module to merge blood features with chronic disease semantics into a latent space. In our experiments, ClinicalBERT and PubMedBERT, when combined with attention fusion, achieved an accuracy of 73% in multiclass chronic disease and diabetes prediction. By transforming laboratory test values into textual descriptions and employing the Flan-T5 model, we achieved an Area Under the ROC Curve (AUROC) of 76%, demonstrating the effectiveness of leveraging numerical text data for training and inference in language models. This approach significantly improves the accuracy of early-stage diabetes prediction.
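The idea of transforming laboratory test values into textual descriptions can be illustrated with a minimal sketch. The abstract does not give the sentence template, the field names, or the reference ranges, so everything below (`LabResult`, `describe_lab`, `build_prompt`, and the example ranges) is a hypothetical illustration rather than the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class LabResult:
    """One laboratory measurement (hypothetical schema, not from the paper)."""
    name: str
    value: float
    unit: str
    ref_low: float   # assumed lower bound of the reference range
    ref_high: float  # assumed upper bound of the reference range


def describe_lab(result: LabResult) -> str:
    """Turn a numeric lab value into one plain-English sentence."""
    if result.value < result.ref_low:
        status = "below"
    elif result.value > result.ref_high:
        status = "above"
    else:
        status = "within"
    return (
        f"{result.name} is {result.value:g} {result.unit}, {status} the "
        f"reference range of {result.ref_low:g} to {result.ref_high:g} "
        f"{result.unit}."
    )


def build_prompt(results: list[LabResult], question: str) -> str:
    """Concatenate the sentences into a text prompt for a language model."""
    findings = " ".join(describe_lab(r) for r in results)
    return f"Laboratory findings: {findings} Question: {question}"


# Illustrative values only; the reference ranges are assumptions.
labs = [
    LabResult("HbA1c", 7.2, "%", 4.0, 5.6),
    LabResult("Fasting glucose", 90.0, "mg/dL", 70.0, 99.0),
]
prompt = build_prompt(labs, "Is this patient at risk of diabetes?")
```

The resulting string could then be passed to a text-to-text model such as Flan-T5 for training or inference. How the authors actually phrased the descriptions is not stated in the abstract.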
The global COVID-19 pandemic has caused more than six million deaths worldwide. Medicalized hotels were established in Taiwan as quarantine facilities for COVID-19 patients with no or mild symptoms. Because only limited medical care is available at these hotels, it is critical to identify patients at risk of clinical deterioration. This study aimed to develop and evaluate a graph-based deep learning approach for progressive hospital transfer risk prediction in a medicalized hotel setting. Vital sign measurements were obtained for 632 patients, and daily patient similarity graphs were constructed. Inductive graph convolutional network models were trained on the temporally integrated graphs to predict hospital transfer risk. The proposed models achieved AUC scores above 0.83 for hospital transfer risk prediction based on measurements from the past 1, 2, and 3 days, outperforming baseline machine learning methods. A post-hoc analysis of the constructed diffusion-based graph using the local clustering coefficient identified a high-risk cluster with a significantly higher mean age, higher body temperature, lower SpO2, and shorter length of stay. A time-to-hospital-transfer survival analysis further revealed a significant decrease in survival probability in this high-risk cluster. These results demonstrate the promising predictive performance and interpretability of the proposed graph-based approach. This technique may help preemptively detect high-risk patients at community-based medical facilities similar to a medicalized hotel.
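To make the patient similarity graph and the local clustering coefficient concrete, the following is a minimal sketch under stated assumptions. It connects patients whose vital sign vectors lie within a Euclidean distance threshold, which is a simple stand-in for the diffusion-based construction mentioned in the abstract and not the authors' method. The feature values, the threshold, and the function names are hypothetical.

```python
import math
from itertools import combinations


def build_similarity_graph(features, threshold):
    """Link two patients when their feature vectors are at most
    `threshold` apart (Euclidean distance). This is an assumed,
    simplified rule, not the diffusion-based graph from the study."""
    adjacency = {pid: set() for pid in features}
    for a, b in combinations(features, 2):
        if math.dist(features[a], features[b]) <= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def local_clustering_coefficient(adjacency, node):
    """Fraction of pairs of a node's neighbors that are themselves linked.
    Defined as 0.0 for nodes with fewer than two neighbors."""
    neighbors = adjacency[node]
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
    return 2.0 * links / (k * (k - 1))


# Hypothetical standardized vital signs (e.g. temperature, SpO2) per patient.
features = {
    "p1": (0.0, 0.0),
    "p2": (0.1, 0.0),
    "p3": (0.0, 0.1),
    "p4": (5.0, 5.0),    # far from everyone, so isolated
    "p5": (-0.1, -0.1),  # close to p1 only
}
graph = build_similarity_graph(features, threshold=0.2)
coefficients = {p: local_clustering_coefficient(graph, p) for p in graph}
```

In this toy example, p2 and p3 sit in a fully connected neighborhood and receive a coefficient of 1.0, while p1 has three neighbors of which only one pair is linked, giving 1/3. Densely connected regions of this kind are what a cluster-level post-hoc analysis would inspect.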