Abstract: Large Language Models (LLMs) are increasingly being leveraged for generating and translating scientific computer codes by both domain experts and non-experts. Fortran has served as one of the go-to programming languages in legacy high-performance computing (HPC) for scientific discoveries. Despite growing adoption, LLM-based translation of legacy codebases has not been thoroughly assessed or quantified for its usability. Here, we studied the applicability of LLM-based translation of Fortran to C++ as a step towards building an agentic workflow using open-weight LLMs on two different computational platforms. We statistically quantified the compilation accuracy of the translated C++ codes, measured the similarity of the LLM-translated code to the human-translated C++ code, and statistically quantified the output similarity of the Fortran-to-C++ translation.
Abstract: The nonlinearity of activation functions used in deep learning models is crucial for the success of predictive models. There are several commonly used simple nonlinear functions, including the Rectified Linear Unit (ReLU) and Leaky-ReLU (L-ReLU). In practice, these functions remarkably enhance model accuracy. However, there is limited insight into the functionality of these nonlinear activation functions in terms of why certain models perform better than others. Here, we investigate model performance when using ReLU or L-ReLU as the activation function in different model architectures and data domains. Interestingly, we found that the application of L-ReLU is mostly effective when the number of trainable parameters in a model is relatively small. Furthermore, we found that image classification models seem to perform well with L-ReLU in fully connected layers, especially when pre-trained models such as VGG-16 are used for transfer learning.
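For readers unfamiliar with the two activation functions compared in this abstract, a minimal NumPy sketch of their standard definitions follows. ReLU zeroes out negative inputs, while L-ReLU instead scales them by a small slope; the slope value `alpha=0.01` below is a common default and an assumption here, not necessarily the value used in the study.

```python
import numpy as np

def relu(x):
    # ReLU: passes positive inputs unchanged, zeroes out negatives
    return np.maximum(0.0, x)

def leaky_relu(x, alpha=0.01):
    # L-ReLU: like ReLU, but negative inputs are scaled by a small
    # slope alpha instead of being zeroed (alpha=0.01 is a common
    # default; the paper's chosen slope may differ)
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -0.5, 0.0, 1.5])
print(relu(x))        # negatives become 0, positives unchanged
print(leaky_relu(x))  # negatives scaled by alpha, positives unchanged
```

The key practical difference is that L-ReLU keeps a nonzero gradient for negative inputs, which is one reason it can behave differently from ReLU as the parameter count changes.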