Elsa A. Olivetti

Language Models Enable Data-Augmented Synthesis Planning for Inorganic Materials

Jun 14, 2025

Thorben Prein, Elton Pan, Janik Jehkul, Steffen Weinmann, Elsa A. Olivetti, Jennifer L. M. Rupp

Abstract: Inorganic synthesis planning currently relies primarily on heuristic approaches or machine-learning models trained on limited datasets, which constrains its generality. We demonstrate that language models, without task-specific fine-tuning, can recall synthesis conditions. Off-the-shelf models, such as GPT-4.1, Gemini 2.0 Flash and Llama 4 Maverick, achieve a Top-1 precursor-prediction accuracy of up to 53.8% and a Top-5 performance of 66.1% on a held-out set of 1,000 reactions. They also predict calcination and sintering temperatures with mean absolute errors below 126 °C, matching specialized regression methods. Ensembling these language models further enhances predictive accuracy and reduces inference cost per prediction by up to 70%. We subsequently employ language models to generate 28,548 synthetic reaction recipes, which we combine with literature-mined examples to pretrain a transformer-based model, SyntMTE. After fine-tuning on the combined dataset, SyntMTE reduces mean absolute error in sintering temperature prediction to 73 °C and in calcination temperature to 98 °C. This strategy improves models by up to 8.7% compared with baselines trained exclusively on experimental data. Finally, in a case study on Li7La3Zr2O12 solid-state electrolytes, we demonstrate that SyntMTE reproduces the experimentally observed dopant-dependent sintering trends. Our hybrid workflow enables scalable, data-efficient inorganic synthesis planning.

Emerging Microelectronic Materials by Design: Navigating Combinatorial Design Space with Scarce and Dispersed Data

Dec 23, 2024

Hengrui Zhang, Alexandru B. Georgescu, Suraj Yerramilli, Christopher Karpovich, Daniel W. Apley, Elsa A. Olivetti, James M. Rondinelli, Wei Chen

Abstract: The increasing demands of sustainable energy, electronics, and biomedical applications call for next-generation functional materials with unprecedented properties. Of particular interest are emerging materials that display exceptional physical properties, making them promising candidates in energy-efficient microelectronic devices. As the conventional Edisonian approach becomes significantly outpaced by growing societal needs, emerging computational modeling and machine learning (ML) methods are employed for the rational design of materials. However, the complex physical mechanisms, cost of first-principles calculations, and the dispersity and scarcity of data pose challenges to both physics-based and data-driven materials modeling. Moreover, the combinatorial composition-structure design space is high-dimensional and often disjoint, making design optimization nontrivial. In this Account, we review a team effort toward establishing a framework that integrates data-driven and physics-based methods to address these challenges and accelerate materials design. We begin by presenting our integrated materials design framework and its three components in a general context. We then provide an example of applying this materials design framework to metal-insulator transition (MIT) materials, a specific type of emerging materials with practical importance in next-generation memory technologies. We identify multiple new materials which may display this property and propose pathways for their synthesis. Finally, we identify some outstanding challenges in data-driven materials design, such as materials data quality issues and property-performance mismatch. We seek to raise awareness of these overlooked issues hindering materials design, thus stimulating efforts toward developing methods to mitigate the gaps.

* 22 pages, 8 figures
