Luis Rangel DaCosta

Contrast transfer functions help quantify neural network out-of-distribution generalization in HRTEM

Dec 09, 2025

Luis Rangel DaCosta, Mary C. Scott

Abstract: Neural networks, while effective for tackling many challenging scientific tasks, are not known to perform well out-of-distribution (OOD), i.e., within domains that differ from their training data. Understanding neural network OOD generalization is paramount to their successful deployment in experimental workflows, especially when ground-truth knowledge about the experiment is hard to establish or experimental conditions vary significantly. With inherent access to ground-truth information and fine-grained control of underlying distributions, simulation-based data curation facilitates precise investigation of OOD generalization behavior. Here, we probe generalization with respect to imaging conditions of neural network segmentation models for high-resolution transmission electron microscopy (HRTEM) imaging of nanoparticles, training and measuring the OOD generalization of over 12,000 neural networks using synthetic data generated via random structure sampling and multislice simulation. Using the HRTEM contrast transfer function, we further develop a framework to compare the information content of HRTEM datasets and quantify OOD domain shifts. We demonstrate that neural network segmentation models enjoy significant performance stability, but that their performance will smoothly and predictably worsen as imaging conditions shift from the training distribution. Lastly, we consider limitations of our approach in explaining other OOD shifts, such as shifts in the atomic structures, and discuss complementary techniques for understanding generalization in such settings.
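To make the role of the contrast transfer function (CTF) more concrete, the following is a minimal sketch of the textbook phase-contrast CTF, sin(chi(k)), for two imaging conditions, together with a crude difference score between the two curves. It is not the framework developed in the paper: the function names, the aberration terms kept (defocus and spherical aberration only), the sign convention (positive defocus meaning underfocus), and the mean-absolute-difference score are all assumptions chosen for illustration.

```python
import math


def electron_wavelength_angstrom(voltage_kv):
    """Relativistic electron wavelength in angstroms for a given accelerating voltage."""
    h = 6.62607015e-34      # Planck constant, J s
    m0 = 9.1093837015e-31   # electron rest mass, kg
    e = 1.602176634e-19     # elementary charge, C
    c = 299792458.0         # speed of light, m/s
    v = voltage_kv * 1e3
    lam_m = h / math.sqrt(2 * m0 * e * v * (1 + e * v / (2 * m0 * c ** 2)))
    return lam_m * 1e10


def phase_ctf(k, voltage_kv, defocus_angstrom, cs_mm):
    """Phase-contrast CTF sin(chi) at spatial frequency k (1/angstrom).

    Assumed convention: chi = pi * lam * k^2 * (0.5 * Cs * lam^2 * k^2 - defocus),
    with positive defocus meaning underfocus. Other conventions flip signs.
    """
    lam = electron_wavelength_angstrom(voltage_kv)
    cs = cs_mm * 1e7  # mm to angstrom
    chi = math.pi * lam * k ** 2 * (0.5 * cs * lam ** 2 * k ** 2 - defocus_angstrom)
    return math.sin(chi)


def ctf_curve(voltage_kv, defocus_angstrom, cs_mm, k_max=1.0, n=256):
    """Sample the CTF on n evenly spaced frequencies in [0, k_max]."""
    return [
        phase_ctf(i * k_max / (n - 1), voltage_kv, defocus_angstrom, cs_mm)
        for i in range(n)
    ]


def mean_abs_ctf_difference(cond_a, cond_b, k_max=1.0, n=256):
    """Hypothetical shift score: mean absolute difference between two CTF curves."""
    a = ctf_curve(k_max=k_max, n=n, **cond_a)
    b = ctf_curve(k_max=k_max, n=n, **cond_b)
    return sum(abs(x - y) for x, y in zip(a, b)) / n


# Two illustrative imaging conditions (values are made up for the example).
train_cond = {"voltage_kv": 300.0, "defocus_angstrom": 0.0, "cs_mm": 1.0}
test_cond = {"voltage_kv": 300.0, "defocus_angstrom": 500.0, "cs_mm": 1.0}
shift = mean_abs_ctf_difference(train_cond, test_cond)
```

A score of zero means the two conditions transfer every sampled frequency identically; larger values indicate that the two conditions pass different spatial frequencies with different contrast, which is one simple way to think about an imaging-condition shift.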

A robust synthetic data generation framework for machine learning in High-Resolution Transmission Electron Microscopy (HRTEM)

Sep 12, 2023

Luis Rangel DaCosta, Katherine Sytwu, Catherine Groschner, Mary Scott

Abstract: Machine learning techniques are attractive options for developing highly accurate automated analysis tools for nanomaterials characterization, including high-resolution transmission electron microscopy (HRTEM). However, successfully implementing such machine learning tools can be difficult due to the challenges in procuring sufficiently large, high-quality training datasets from experiments. In this work, we introduce Construction Zone, a Python package for rapidly generating complex nanoscale atomic structures, and develop an end-to-end workflow for creating large simulated databases for training neural networks. Construction Zone enables fast, systematic sampling of realistic nanomaterial structures, and can be used as a random structure generator for simulated databases, which is important for generating large, diverse synthetic datasets. Using HRTEM imaging as an example, we train a series of neural networks on various subsets of our simulated databases to segment nanoparticles and holistically study the data curation process to understand how various aspects of the curated simulated data -- including simulation fidelity, the distribution of atomic structures, and the distribution of imaging conditions -- affect model performance across several experimental benchmarks. Using our results, we achieve state-of-the-art segmentation performance on experimental HRTEM images of nanoparticles from several experimental benchmarks and, further, we discuss robust strategies for consistently achieving high performance with machine learning in experimental settings using purely synthetic data.

Generalization Across Experimental Parameters in Machine Learning Analysis of High Resolution Transmission Electron Microscopy Datasets

Jun 20, 2023

Katherine Sytwu, Luis Rangel DaCosta, Mary C. Scott

Abstract: Neural networks are promising tools for high-throughput and accurate transmission electron microscopy (TEM) analysis of nanomaterials, but are known to generalize poorly on data that is "out-of-distribution" from their training data. Given the limited set of image features typically seen in high-resolution TEM imaging, it is unclear which images are considered out-of-distribution from others. Here, we investigate how the choice of metadata features in the training dataset influences neural network performance, focusing on the example task of nanoparticle segmentation. We train and validate neural networks across curated, experimentally collected high-resolution TEM image datasets of nanoparticles under controlled imaging and material parameters, including magnification, dosage, nanoparticle diameter, and nanoparticle material. Overall, we find that our neural networks are not robust across microscope parameters, but do generalize across certain sample parameters. Additionally, data preprocessing heavily influences the generalizability of neural networks trained on nominally similar datasets. Our results highlight the need to understand how dataset features affect the deployment of data-driven algorithms.

* 11 pages, 5 figures
