Background: Deep neural networks have proven to be powerful computational tools for modeling, prediction, and generation. However, the workings of these models have generally been opaque. Recent work has shown that the performance of some models are modulated by overlapping functional networks of connections within the models. Here the techniques of functional neuroimaging are applied to an exemplary large language model to probe its functional structure. Methods: A series of block-designed task-based prompt sequences were generated to probe the Facebook Galactica-125M model. Tasks included prompts relating to political science, medical imaging, paleontology, archeology, pathology, and random strings presented in an off/on/off pattern with prompts about other random topics. For the generation of each output token, all layer output values were saved to create an effective time series. General linear models were fit to the data to identify layer output values which were active with the tasks. Results: Distinct, overlapping networks were identified with each task. Most overlap was observed between medical imaging and pathology networks. These networks were repeatable across repeated performance of related tasks, and correspondence of identified functional networks and activation in tasks not used to define the functional networks was shown to accurately identify the presented task. Conclusion: The techniques of functional neuroimaging can be applied to deep neural networks as a means to probe their workings. Identified functional networks hold the potential for use in model alignment, modulation of model output, and identifying weights to target in fine-tuning.
Purpose: This study evaluates the effectiveness and impact of automated order-based protocol assignment for magnetic resonance imaging (MRI) exams using natural language processing (NLP) and deep learning (DL). Methods: NLP tools were applied to retrospectively process orders from over 116,000 MRI exams with 200 unique sub-specialized protocols ("Local" protocol class). Separate DL models were trained on 70\% of the processed data for "Local" protocols as well as 93 American College of Radiology ("ACR") protocols and 48 "General" protocols. The DL Models were assessed in an "auto-protocoling (AP)" inference mode which returns the top recommendation and in a "clinical decision support (CDS)" inference mode which returns up to 10 protocols for radiologist review. The accuracy of each protocol recommendation was computed and analyzed based on the difference between the normalized output score of the corresponding neural net for the top two recommendations. Results: The top predicted protocol in AP mode was correct for 82.8%, 73.8%, and 69.3% of the test cases for "General", "ACR", and "Local" protocol classes, respectively. Higher levels of accuracy over 96% were obtained for all protocol classes in CDS mode. However, at current validation performance levels, the proposed models offer modest, positive, financial impact on large-scale imaging networks. Conclusions: DL-based protocol automation is feasible and can be tuned to route substantial fractions of exams for auto-protocoling, with higher accuracy with more general protocols. Economic analyses of the tested algorithms indicate that improved algorithm performance is required to yield a practical exam auto-protocoling tool for sub-specialized imaging exams.