Autonomous materials research labs require the ability to combine and learn from diverse data streams. This is especially true for learning material synthesis-process-structure-property relationships, key to accelerating materials optimization and discovery as well as accelerating mechanistic understanding. We present the Synthesis-process-structure-property relAtionship coreGionalized lEarner (SAGE) algorithm. A fully Bayesian algorithm that uses multimodal coregionalization to merge knowledge across data sources to learn synthesis-process-structure-property relationships. SAGE outputs a probabilistic posterior for the relationships including the most likely relationships given the data.
Advanced materials are needed to further next-generation technologies such as quantum computing, carbon capture, and low-cost medical imaging. However, advanced materials discovery is confounded by two fundamental challenges: the challenge of a high-dimensional, complex materials search space and the challenge of combining knowledge, i.e., data fusion across instruments and labs. To overcome the first challenge, researchers employ knowledge of the underlying material synthesis-structure-property relationship, as a material's structure is often predictive of its functional property and vice versa. For example, optimal materials often occur along composition-phase boundaries or within specific phase regions. Additionally, knowledge of the synthesis-structure-property relationship is fundamental to understanding underlying physical mechanisms. However, quantifying the synthesis-structure-property relationship requires overcoming the second challenge. Researchers must merge knowledge gathered across instruments, measurement modalities, and even laboratories. We present the Synthesis-structure-property relAtionship coreGionalized lEarner (SAGE) algorithm. A fully Bayesian algorithm that uses multimodal coregionalization to merge knowledge across data sources to learn synthesis-structure-property relationships.
Autonomous experimentation (AE) combines machine learning and research hardware automation in a closed loop, guiding subsequent experiments toward user goals. As applied to materials research, AE can accelerate materials exploration, reducing time and cost compared to traditional Edisonian studies. Additionally, integrating knowledge from diverse sources including theory, simulations, literature, and domain experts can boost AE performance. Domain experts may provide unique knowledge addressing tasks that are difficult to automate. Here, we present a set of methods for integrating human input into an autonomous materials exploration campaign for composition-structure phase mapping. The methods are demonstrated on x-ray diffraction data collected from a thin film ternary combinatorial library. At any point during the campaign, the user can choose to provide input by indicating regions-of-interest, likely phase regions, and likely phase boundaries based on their prior knowledge (e.g., knowledge of the phase map of a similar material system), along with quantifying their certainty. The human input is integrated by defining a set of probabilistic priors over the phase map. Algorithm output is a probabilistic distribution over potential phase maps, given the data, model, and human input. We demonstrate a significant improvement in phase mapping performance given appropriate human input.
The field of autonomous physical science - where machine learning guides and learns from experiments in a closed-loop - is rapidly growing in importance. Autonomous systems allow scientists to fail smarter, learn faster, and spend less resources in their studies. The field promises improved performance for various facilities such as labs, research and development pipelines, and warehouses. As autonomous systems grow in number, capability, and complexity, a new challenge arises - how will these systems work together across large facilities? We explore one solution to this question - a multi-agent framework. We demonstrate a framework with 1) a simulated facility with realistic resource limits such as equipment use limits, 2) machine learning agents with diverse learning capabilities and goals, control over lab instruments, and the ability to run research campaigns, and 3) a network over which these agents can share knowledge and work together to achieve individual or collective goals. The framework is dubbed the MULTI-agent auTonomous fAcilities - a Scalable frameworK aka MULTITASK. MULTITASK allows facility-wide simulations including agent-instrument and agent-agent interactions. Framework modularity allows real-world autonomous spaces to come on-line in phases, with simulated instruments gradually replaced by real-world instruments. Here we demonstrate the framework with a real-world materials science challenge of materials exploration and optimization in a simulated materials lab. We hope the framework opens new areas of research in agent-based facility control scenarios such as agent-to-agent markets and economies, management and decision-making structures, communication and data-sharing structures, and optimization strategies for agents and facilities including those based on game theory.
The next generation of physical science involves robot scientists - autonomous physical science systems capable of experimental design, execution, and analysis in a closed loop. Such systems have shown real-world success for scientific exploration and discovery, including the first discovery of a best-in-class material. To build and use these systems, the next generation workforce requires expertise in diverse areas including ML, control systems, measurement science, materials synthesis, decision theory, among others. However, education is lagging. Educators need a low-cost, easy-to-use platform to teach the required skills. Industry can also use such a platform for developing and evaluating autonomous physical science methodologies. We present the next generation in science education, a kit for building a low-cost autonomous scientist. The kit was used during two courses at the University of Maryland to teach undergraduate and graduate students autonomous physical science. We discuss its use in the course and its greater capability to teach the dual tasks of autonomous model exploration, optimization, and determination, with an example of autonomous experimental "discovery" of the Henderson-Hasselbalch equation.
Autonomous physical science is revolutionizing materials science. In these systems, machine learning controls experiment design, execution, and analysis in a closed loop. Active learning, the machine learning field of optimal experiment design, selects each subsequent experiment to maximize knowledge toward the user goal. Autonomous system performance can be further improved with implementation of scientific machine learning, also known as inductive bias-engineered artificial intelligence, which folds prior knowledge of physical laws (e.g., Gibbs phase rule) into the algorithm. As the number, diversity, and uses for active learning strategies grow, there is an associated growing necessity for real-world reference datasets to benchmark strategies. We present a reference dataset and demonstrate its use to benchmark active learning strategies in the form of various acquisition functions. Active learning strategies are used to rapidly identify materials with optimal physical properties within a ternary materials system. The data is from an actual Fe-Co-Ni thin-film library and includes previously acquired experimental data for materials compositions, X-ray diffraction patterns, and two functional properties of magnetic coercivity and the Kerr rotation. Popular active learning methods along with a recent scientific active learning method are benchmarked for their materials optimization performance. We discuss the relationship between algorithm performance, materials search space complexity, and the incorporation of prior knowledge.
Application of artificial intelligence (AI), and more specifically machine learning, to the physical sciences has expanded significantly over the past decades. In particular, science-informed AI or scientific AI has grown from a focus on data analysis to now controlling experiment design, simulation, execution and analysis in closed-loop autonomous systems. The CAMEO (closed-loop autonomous materials exploration and optimization) algorithm employs scientific AI to address two tasks: learning a material system's composition-structure relationship and identifying materials compositions with optimal functional properties. By integrating these, accelerated materials screening across compositional phase diagrams was demonstrated, resulting in the discovery of a best-in-class phase change memory material. Key to this success is the ability to guide subsequent measurements to maximize knowledge of the composition-structure relationship, or phase map. In this work we investigate the benefits of incorporating varying levels of prior physical knowledge into CAMEO's autonomous phase-mapping. This includes the use of ab-initio phase boundary data from the AFLOW repositories, which has been shown to optimize CAMEO's search when used as a prior.