A knowledge gap persists between Machine Learning (ML) developers (e.g., data scientists) and practitioners (e.g., clinicians), hampering the full utilization of ML for clinical data analysis. We investigated the potential of the chatGPT Advanced Data Analysis (ADA), an extension of GPT-4, to bridge this gap and perform ML analyses efficiently. Real-world clinical datasets and study details from large trials across various medical specialties were presented to chatGPT ADA without specific guidance. ChatGPT ADA autonomously developed state-of-the-art ML models based on the original study's training data to predict clinical outcomes such as cancer development, cancer progression, disease complications, or biomarkers such as pathogenic gene sequences. Strikingly, these ML models matched or outperformed their published counterparts. We conclude that chatGPT ADA offers a promising avenue to democratize ML in medicine, making advanced analytics accessible to non-ML experts and promoting broader applications in medical research and practice.
Developing robust and effective artificial intelligence (AI) models in medicine requires access to large amounts of patient data. The use of AI models solely trained on large multi-institutional datasets can help with this, yet the imperative to ensure data privacy remains, particularly as membership inference risks breaching patient confidentiality. As a proposed remedy, we advocate for the integration of differential privacy (DP). We specifically investigate the performance of models trained with DP as compared to models trained without DP on data from institutions that the model had not seen during its training (i.e., external validation) - the situation that is reflective of the clinical use of AI models. By leveraging more than 590,000 chest radiographs from five institutions, we evaluated the efficacy of DP-enhanced domain transfer (DP-DT) in diagnosing cardiomegaly, pleural effusion, pneumonia, atelectasis, and in identifying healthy subjects. We juxtaposed DP-DT with non-DP-DT and examined diagnostic accuracy and demographic fairness using the area under the receiver operating characteristic curve (AUC) as the main metric, as well as accuracy, sensitivity, and specificity. Our results show that DP-DT, even with exceptionally high privacy levels (epsilon around 1), performs comparably to non-DP-DT (P>0.119 across all domains). Furthermore, DP-DT led to marginal AUC differences - less than 1% - for nearly all subgroups, relative to non-DP-DT. Despite consistent evidence suggesting that DP models induce significant performance degradation for on-domain applications, we show that off-domain performance is almost not affected. Therefore, we ardently advocate for the adoption of DP in training diagnostic medical AI models, given its minimal impact on performance.
Twitter sentiment analysis, which often focuses on predicting the polarity of tweets, has attracted increasing attention over the last years, in particular with the rise of deep learning (DL). In this paper, we propose a new task: predicting the predominant sentiment among (first-order) replies to a given tweet. Therefore, we created RETWEET, a large dataset of tweets and replies manually annotated with sentiment labels. As a strong baseline, we propose a two-stage DL-based method: first, we create automatically labeled training data by applying a standard sentiment classifier to tweet replies and aggregating its predictions for each original tweet; our rationale is that individual errors made by the classifier are likely to cancel out in the aggregation step. Second, we use the automatically labeled data for supervised training of a neural network to predict reply sentiment from the original tweets. The resulting classifier is evaluated on the new RETWEET dataset, showing promising results, especially considering that it has been trained without any manually labeled data. Both the dataset and the baseline implementation are publicly available.
During the strip rolling process, a considerable amount of the forces of the material pressure cause elastic deformation on the work-roll, i.e., the deflection process. The uncontrollable amount of the work-roll deflection leads to the high deviations in the permissible thickness of the plate along its width. In the context of the Austenitic Stainless Steels (ASS), due to the instability of the Austenite phase in a cold temperature, cold deformation leads to the production of Strain-Induced Martensite (SIM), which improves the mechanical properties. It leads to the hardening of the ASS 316L during the cold deformation, which causes the Strain-Stress curve of the ASS 316L to behave non-linearly, which distinguishes it from other categories of steels. To account for this phenomenon, we propose to utilize a Machine Learning (ML) method to predict more accurately the flow stress of the ASS 316L during the cold rolling. Furthermore, we conduct various mechanical tensile tests in order to obtain the required dataset, Stress316L, for training the neural network. Moreover, instead of using a constant value of flow stress during the multi-pass rolling process, we use a Finite Difference (FD) formulation of the equilibrium equation in order to account for the dynamic behavior of the flow stress, which leads to the estimation of the mean pressure, which the strip enforces to the rolls during deformation. Finally, using the Finite Element Analysis (FEA), the deflection of the work-roll tools will be calculated. As a result, we end up with a generalized model for the calculation of the roll deflection, specific to the ASS 316L. To the best of our knowledge, this is the first model for ASS 316L which considers dynamic flow stress and SIM of the rolled plate, using FEM and an ML approach, which could contribute to the better design of the tolls.