Creating a Dataset Supporting Translation Between OpenMP Fortran and C++ Code

Jul 15, 2023
Bin Lei, Caiwen Ding, Le Chen, Pei-Hung Lin, Chunhua Liao

In this study, we present a novel dataset for training machine learning models to translate between OpenMP Fortran and C++ code. To ensure reliability and applicability, the dataset is first refined using a meticulous code similarity test. We assess the effectiveness of the dataset using both quantitative (CodeBLEU) and qualitative (human evaluation) methods. We demonstrate that this dataset can significantly improve the translation capabilities of large-scale language models, with improvements of 5.1× for models with no prior coding knowledge and 9.9× for models with some coding familiarity. Our work highlights the potential of this dataset to advance the field of code translation for high-performance computing.
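For readers unfamiliar with the quantitative metric, CodeBLEU combines four component scores: n-gram match, keyword-weighted n-gram match, syntax (AST) match, and data-flow match. The following is a minimal sketch of that combination only, not the authors' evaluation pipeline. The function names are hypothetical, the equal weights are an assumption (the abstract does not state the configuration used), and the clipped n-gram precision stands in for the full BLEU component.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_ngram_precision(candidate, reference, n=2):
    """Fraction of candidate n-grams that also occur in the reference.

    Each n-gram's matches are clipped to its count in the reference,
    as in BLEU's modified precision. Simplified: one reference, one n.
    """
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return matched / total


def combine_codebleu(ngram, weighted_ngram, ast_match, dataflow_match,
                     weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four CodeBLEU components.

    The equal weights are an assumption made for illustration.
    """
    parts = (ngram, weighted_ngram, ast_match, dataflow_match)
    return sum(w * p for w, p in zip(weights, parts))


# Hypothetical token sequences for a translated OpenMP loop header.
reference = "#pragma omp parallel for private ( i )".split()
candidate = "#pragma omp parallel for shared ( i )".split()
bigram_precision = clipped_ngram_precision(candidate, reference, n=2)
```

The AST and data-flow components require parsing the code and are left as inputs here; in practice they would come from a parser for the target language.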
