Abstract: Constructing production-quality machine-learned interatomic potentials (MLIPs) requires balancing accuracy, dynamical stability, and computational throughput under constraints that are not captured by a single training loss. We introduce MLIPilot, an auto-research framework in which tool-calling large language models propose hypotheses, edit MLIP training code, launch HPC jobs, and accept or revert changes using a fixed, physically constrained scorecard. We evaluate MLIPilot on MACE potential optimization using both commercial and open-weight LLM agents, including GPT-5.5, GPT-4.1, Mistral-24B, and Qwen3-32B. The benchmarks span molecular and periodic settings: a QM7-derived dataset for which we generated B3LYP/6-31G(d) energies and forces, and a Cu EMT dataset with periodic copper supercells labeled by ASE's Effective Medium Theory calculator. Across these benchmarks, the strongest agents move initially constraint-violating baselines to accepted models by discovering useful training strategies, including output normalization, loss-function changes, progressive training schedules, and model-capacity adjustments. These results suggest that LLM agents can serve as autonomous operators for scientific machine-learning workflows when their search is constrained by domain-specific validation criteria, shifting part of MLIP development from manual trial-and-error toward auditable, automated experimentation.
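The accept-or-revert loop described in this abstract can be illustrated with a minimal sketch. This is not MLIPilot's implementation: the `Scorecard` class, the metric names, the threshold values, and the acceptance rule (reduce total constraint violation first, then improve one objective) are all assumptions introduced for illustration, and the `evaluate` callable stands in for a full training and validation job.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]
Config = Dict[str, object]


@dataclass(frozen=True)
class Scorecard:
    """Hypothetical fixed scorecard: hard upper limits plus one objective."""
    max_values: Dict[str, float]  # assumed per-metric upper limits
    objective: str                # metric to minimize once limits are met

    def violation(self, m: Metrics) -> float:
        # Sum of relative overshoots; 0.0 means every limit is satisfied.
        return sum(max(0.0, m[k] / lim - 1.0)
                   for k, lim in self.max_values.items())

    def accepts(self, new: Metrics, old: Metrics) -> bool:
        # Assumed rule: reduce constraint violation first, then the objective.
        v_new, v_old = self.violation(new), self.violation(old)
        if v_new != v_old:
            return v_new < v_old
        return new[self.objective] < old[self.objective]


def run_search(
    baseline: Config,
    proposals: List[Config],
    evaluate: Callable[[Config], Metrics],
    scorecard: Scorecard,
) -> Tuple[Config, Metrics, List[Tuple[Config, bool]]]:
    """Apply each proposed change; keep it if accepted, otherwise revert."""
    config, metrics = dict(baseline), evaluate(baseline)
    log: List[Tuple[Config, bool]] = []
    for change in proposals:
        candidate = {**config, **change}
        cand_metrics = evaluate(candidate)
        accepted = scorecard.accepts(cand_metrics, metrics)
        log.append((change, accepted))  # audit trail of every decision
        if accepted:
            config, metrics = candidate, cand_metrics
        # On rejection nothing is carried over, which acts as the revert.
    return config, metrics, log
```

In this sketch the scorecard is immutable and is passed in from outside the loop, so whatever generates the proposals (an LLM agent in the abstract's setting) cannot change the acceptance criteria mid-search, and the returned log records each accept or revert decision.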
Abstract: Carbon capture technologies are increasingly important for remediating CO2 emissions, so capture materials must improve in scalability and efficiency. Materials development, however, can require substantial cost and time. Machine learning offers a promising way to reduce these time and resource burdens by efficiently correlating structure-property relationships, which allows down-selection to promising candidates. To demonstrate this, we have developed an end-to-end "discovery cycle" to select new aqueous amines compatible with commercially viable acid gas scrubbing carbon capture. We combine a simple, rapid laboratory assay for CO2 absorption with a machine-learning-based molecular fingerprinting model. On an external test set, the predictions show 60% accuracy against experiment for both material parameters and 80% for a single parameter. The discovery cycle identified several promising amines that were verified experimentally and that had not previously been applied to carbon capture. In the process, we have compiled a large, single-source data set for carbon capture amines and produced an open-source machine learning tool for identifying amine molecule candidates (https://github.com/IBM/Carbon-capture-fingerprint-generation).