Abstract:Automated decision-making is becoming key for automated characterization including electron and scanning probe microscopies and nano indentation. Most machine learning driven workflows optimize a single predefined objective and tend to converge prematurely on familiar responses, overlooking rare but scientifically important states. More broadly, the challenge is not only where to measure next, but how to coordinate exploration across structural, spectral, and measurement spaces under finite experimental budgets while balancing target-driven optimization with novelty discovery. Here we introduce PATHFINDER, a framework for autonomous microscopy that combines novelty driven exploration with optimization, helping the system discover more diverse and useful representations across structural, spectral, and measurement spaces. By combining latent space representations of local structure, surrogate modeling of functional response, and Pareto-based acquisition, the framework selects measurements that balance novelty discovery in feature and object space and are informative and experimentally actionable. Benchmarked on pre acquired STEM EELS data and realized experimentally in scanning probe microscopy of ferroelectric materials, this approach expands the accessible structure property landscape and avoids collapse onto a single apparent optimum. These results point to a new mode of autonomous microscopy that is not only optimization-driven, but also discovery-oriented, broad in its search, and responsive to human guidance.
Abstract:Modern automated microscopy faces a fundamental discovery challenge: in many systems, the most important scientific information does not reside in the immediately visible image features, but in the target space of sequentially acquired spectra or functional responses, making it essential to develop strategies that can actively search for new behaviors rather than simply optimize known objectives. Here, we developed a deep-kernel-learning BEACON framework that is explicitly designed to guide discovery in the target space by learning structure-property relationships during the experiment and using that evolving model to seek diverse response regimes. We first established the method through demonstration workflows built on pre-acquired ground-truth datasets, which enabled direct benchmarking against classical acquisition strategies and allowed us to define a set of monitoring functions for comparing exploration quality, target-space coverage, and surrogate-model behavior in a transparent and reproducible manner. This benchmarking framework provides a practical basis for evaluating discovery-driven algorithms, not just optimization performance. We then operationalized and deployed the workflow on STEM, showing that the approach can transition from offline validation to real experimental implementation. To support adoption and extension by the broader community, the associated notebooks are available, allowing users to reproduce the workflows, test the benchmarks, and adapt the method to their own instruments and datasets.




Abstract:Microscopy is a primary source of information on materials structure and functionality at nanometer and atomic scales. The data generated is often well-structured, enriched with metadata and sample histories, though not always consistent in detail or format. The adoption of Data Management Plans (DMPs) by major funding agencies promotes preservation and access. However, deriving insights remains difficult due to the lack of standardized code ecosystems, benchmarks, and integration strategies. As a result, data usage is inefficient and analysis time is extensive. In addition to post-acquisition analysis, new APIs from major microscope manufacturers enable real-time, ML-based analytics for automated decision-making and ML-agent-controlled microscope operation. Yet, a gap remains between the ML and microscopy communities, limiting the impact of these methods on physics, materials discovery, and optimization. Hackathons help bridge this divide by fostering collaboration between ML researchers and microscopy experts. They encourage the development of novel solutions that apply ML to microscopy, while preparing a future workforce for instrumentation, materials science, and applied ML. This hackathon produced benchmark datasets and digital twins of microscopes to support community growth and standardized workflows. All related code is available at GitHub: https://github.com/KalininGroup/Mic-hackathon-2024-codes-publication/tree/1.0.0.1
Abstract:Ferroelectric polarization switching underpins the functional performance of a wide range of materials and devices, yet its dependence on complex local microstructural features renders systematic exploration by manual or grid-based spectroscopic measurements impractical. Here, we introduce a multi-objective kernel-learning workflow that infers the microstructural rules governing switching behavior directly from high-resolution imaging data. Applied to automated piezoresponse force microscopy (PFM) experiments, our framework efficiently identifies the key relationships between domain-wall configurations and local switching kinetics, revealing how specific wall geometries and defect distributions modulate polarization reversal. Post-experiment analysis projects abstract reward functions, such as switching ease and domain symmetry, onto physically interpretable descriptors including domain configuration and proximity to boundaries. This enables not only high-throughput active learning, but also mechanistic insight into the microstructural control of switching phenomena. While demonstrated for ferroelectric domain switching, our approach provides a powerful, generalizable tool for navigating complex, non-differentiable design spaces, from structure-property correlations in molecular discovery to combinatorial optimization across diverse imaging modalities.




Abstract:Domain-wall dynamics in ferroelectric materials are strongly position-dependent since each polar interface is locked into a unique local microstructure. This necessitates spatially resolved studies of the wall-pinning using scanning-probe microscopy techniques. The pinning centers and preexisting domain walls are usually sparse within image plane, precluding the use of dense hyperspectral imaging modes and requiring time-consuming human experimentation. Here, a large area epitaxial PbTiO$_3$ film on cubic KTaO$_3$ were investigated to quantify the electric field driven dynamics of the polar-strain domain structures using ML-controlled automated Piezoresponse Force Microscopy. Analysis of 1500 switching events reveals that domain wall displacement depends not only on field parameters but also on the local ferroelectric-ferroelastic configuration. For example, twin boundaries in polydomains regions like a$_1^-$/$c^+$ $\parallel$ a$_2^-$/$c^-$ stay pinned up to a certain level of bias magnitude and change only marginally as the bias increases from 20V to 30V, whereas single variant boundaries like a$_2^+$/$c^+$ $\parallel$ a$_2^-$/$c^-$ stack are already activated at 20V. These statistics on the possible ferroelectric and ferroelastic wall orientations, together with the automated, high-throughput AFM workflow, can be distilled into a predictive map that links domain configurations to pulse parameters. This microstructure-specific rule set forms the foundation for designing ferroelectric memories.




Abstract:Automated experimentation has the potential to revolutionize scientific discovery, but its effectiveness depends on well-defined optimization targets, which are often uncertain or probabilistic in real-world settings. In this work, we demonstrate the application of Multi-Objective Bayesian Optimization (MOBO) to balance multiple, competing rewards in autonomous experimentation. Using scanning probe microscopy (SPM) imaging, one of the most widely used and foundational SPM modes, we show that MOBO can optimize imaging parameters to enhance measurement quality, reproducibility, and efficiency. A key advantage of this approach is the ability to compute and analyze the Pareto front, which not only guides optimization but also provides physical insights into the trade-offs between different objectives. Additionally, MOBO offers a natural framework for human-in-the-loop decision-making, enabling researchers to fine-tune experimental trade-offs based on domain expertise. By standardizing high-quality, reproducible measurements and integrating human input into AI-driven optimization, this work highlights MOBO as a powerful tool for advancing autonomous scientific discovery.
Abstract:Knowledge driven discovery of novel materials necessitates the development of the causal models for the property emergence. While in classical physical paradigm the causal relationships are deduced based on the physical principles or via experiment, rapid accumulation of observational data necessitates learning causal relationships between dissimilar aspects of materials structure and functionalities based on observations. For this, it is essential to integrate experimental data with prior domain knowledge. Here we demonstrate this approach by combining high-resolution scanning transmission electron microscopy (STEM) data with insights derived from large language models (LLMs). By fine-tuning ChatGPT on domain-specific literature, such as arXiv papers on ferroelectrics, and combining obtained information with data-driven causal discovery, we construct adjacency matrices for Directed Acyclic Graphs (DAGs) that map the causal relationships between structural, chemical, and polarization degrees of freedom in Sm-doped BiFeO3 (SmBFO). This approach enables us to hypothesize how synthesis conditions influence material properties, particularly the coercive field (E0), and guides experimental validation. The ultimate objective of this work is to develop a unified framework that integrates LLM-driven literature analysis with data-driven discovery, facilitating the precise engineering of ferroelectric materials by establishing clear connections between synthesis conditions and their resulting material properties.




Abstract:We introduce a Deep Kernel Learning Variational Autoencoder (VAE-DKL) framework that integrates the generative power of a Variational Autoencoder (VAE) with the predictive nature of Deep Kernel Learning (DKL). The VAE learns a latent representation of high-dimensional data, enabling the generation of novel structures, while DKL refines this latent space by structuring it in alignment with target properties through Gaussian Process (GP) regression. This approach preserves the generative capabilities of the VAE while enhancing its latent space for GP-based property prediction. We evaluate the framework on two datasets: a structured card dataset with predefined variational factors and the QM9 molecular dataset, where enthalpy serves as the target function for optimization. The model demonstrates high-precision property prediction and enables the generation of novel out-of-training subset structures with desired characteristics. The VAE-DKL framework offers a promising approach for high-throughput material discovery and molecular design, balancing structured latent space organization with generative flexibility.




Abstract:Analyzing imaging and hyperspectral data is crucial across scientific fields, including biology, medicine, chemistry, and physics. The primary goal is to transform high-resolution or high-dimensional data into an interpretable format to generate actionable insights, aiding decision-making and advancing knowledge. Currently, this task relies on complex, human-designed workflows comprising iterative steps such as denoising, spatial sampling, keypoint detection, feature generation, clustering, dimensionality reduction, and physics-based deconvolutions. The introduction of machine learning over the past decade has accelerated tasks like image segmentation and object detection via supervised learning, and dimensionality reduction via unsupervised methods. However, both classical and NN-based approaches still require human input, whether for hyperparameter tuning, data labeling, or both. The growing use of automated imaging tools, from atomically resolved imaging to biological applications, demands unsupervised methods that optimize data representation for human decision-making or autonomous experimentation. Here, we discuss advances in reward-based workflows, which adopt expert decision-making principles and demonstrate strong transfer learning across diverse tasks. We represent image analysis as a decision-making process over possible operations and identify desiderata and their mappings to classical decision-making frameworks. Reward-driven workflows enable a shift from supervised, black-box models sensitive to distribution shifts to explainable, unsupervised, and robust optimization in image analysis. They can function as wrappers over classical and DCNN-based methods, making them applicable to both unsupervised and supervised workflows (e.g., classification, regression for structure-property mapping) across imaging and hyperspectral data.
Abstract:Combinatorial libraries are a powerful approach for exploring the evolution of physical properties across binary and ternary cross-sections in multicomponent phase diagrams. Although the synthesis of these libraries has been developed since the 1960s and expedited with advanced laboratory automation, the broader application of combinatorial libraries relies on fast, reliable measurements of concentration-dependent structures and functionalities. Scanning Probe Microscopies (SPM), including piezoresponse force microscopy (PFM), offer significant potential for quantitative, functionally relevant combi-library readouts. Here we demonstrate the implementation of fully automated SPM to explore the evolution of ferroelectric properties in combinatorial libraries, focusing on Sm-doped BiFeO3 and ZnxMg1-xO systems. We also present and compare Gaussian Process-based Bayesian Optimization models for fully automated exploration, emphasizing local reproducibility (effective noise) as an essential factor in optimal experiment workflows. Automated SPM, when coupled with upstream synthesis controls, plays a pivotal role in bridging materials synthesis and characterization.