The ability to rapidly develop materials with desired properties has a transformative impact on a broad range of emerging technologies. In this work, we introduce a new framework based on the diffusion model, a recent generative machine learning method, to predict 3D structures of disordered materials from a target property. For demonstration, we apply the model to amorphous carbon ($a$-C) as a representative material system and identify its atomic structures from target X-ray absorption near edge structure (XANES) spectra. XANES is a common experimental technique for probing the atomic structure of materials. We show that conditional generation guided by XANES spectra reproduces key features of the target structures. Furthermore, we show that our model can steer the generative process to tailor atomic arrangements for a specific XANES spectrum. Finally, our generative model is scale-agnostic: after learning from a small-scale dataset (i.e., one with small unit cells), it can generate realistic, large-scale structures. Our work represents a significant stride in bridging the gap between materials characterization and atomic structure determination; in addition, it can be leveraged for materials discovery by conditioning on other material properties as targets.
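As a rough illustration of the diffusion idea mentioned above, the following sketch shows only the forward (noising) step of a standard denoising diffusion model applied to atomic coordinates. The abstract does not specify the noise schedule, the coordinate representation, or how the XANES spectrum enters the network, so the linear schedule, the function names `make_alpha_bar` and `noise_positions`, and the random toy coordinates are all assumptions; periodic boundary conditions are ignored.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative signal fraction for an assumed linear noise schedule."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_positions(x0, t, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

rng = np.random.default_rng(0)
# Hypothetical toy "structure": 64 atoms with coordinates in a unit box.
x0 = rng.uniform(0.0, 1.0, size=(64, 3))
alpha_bar = make_alpha_bar()
xt, eps = noise_positions(x0, t=500, alpha_bar=alpha_bar, rng=rng)
```

In a conditional setup, a denoising network would be trained to predict `eps` from `xt`, the step `t`, and the target spectrum; that network is not sketched here.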
A new semi-supervised machine learning method for the discovery of structure-spectrum relationships is developed and demonstrated using the specific example of interpreting X-ray absorption near-edge structure (XANES) spectra. This method constructs a one-to-one mapping between individual structure descriptors and spectral trends. Specifically, an adversarial autoencoder is augmented with a novel rank constraint (RankAAE). The RankAAE methodology produces a continuous and interpretable latent space, where each dimension can track an individual structure descriptor. As part of this process, the model provides a robust and quantitative measure of the structure-spectrum relationship by decoupling intertwined spectral contributions from multiple structural characteristics. This makes it well suited to spectral interpretation and the discovery of new descriptors. The capability of this procedure is showcased by considering five local structure descriptors and a database of over fifty thousand simulated XANES spectra across eight first-row transition metal oxide families. The resulting structure-spectrum relationships not only reproduce known trends in the literature, but also reveal unintuitive ones that are visually indiscernible in large data sets. The results suggest that the RankAAE methodology has great potential to help researchers interpret complex scientific data, test physical hypotheses, and reveal new patterns that extend scientific insight.
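To make the notion of a latent dimension "tracking" a structure descriptor concrete, the sketch below measures how well the ordering of one latent coordinate agrees with the ordering of one descriptor. The abstract does not give the form of the rank constraint, so a Spearman-style rank correlation is used here purely as a stand-in; the function names and the synthetic data are assumptions, and ties are not handled.

```python
import numpy as np

def ranks(values):
    """Return the rank (0..n-1) of each entry; assumes no ties."""
    order = np.argsort(values)
    r = np.empty(len(values), dtype=float)
    r[order] = np.arange(len(values))
    return r

def spearman(a, b):
    """Pearson correlation of the ranks of a and b."""
    ra, rb = ranks(a), ranks(b)
    ra -= ra.mean()
    rb -= rb.mean()
    return float(ra @ rb / np.sqrt((ra @ ra) * (rb @ rb)))

rng = np.random.default_rng(0)
descriptor = rng.uniform(size=200)       # hypothetical structure descriptor
latent_aligned = np.exp(descriptor)      # monotone in the descriptor
latent_random = rng.standard_normal(200) # unrelated to the descriptor

score_aligned = spearman(descriptor, latent_aligned)
score_random = spearman(descriptor, latent_random)
```

A latent dimension that is any monotone function of the descriptor scores close to 1, while an unrelated dimension scores near 0. A training-time constraint would need a differentiable surrogate of such a score, which is not attempted here.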
We employ variational autoencoders to extract physical insight from a dataset of one-particle Anderson impurity model spectral functions. Autoencoders are trained to find a low-dimensional, latent space representation that faithfully characterizes each element of the training set, as measured by a reconstruction error. Variational autoencoders, a probabilistic generalization of standard autoencoders, further condition the learned latent space to promote highly interpretable features. In our study, we find that the learned latent space components strongly correlate with well-known, but nontrivial, parameters that characterize emergent behaviors in the Anderson impurity model. In particular, one latent space component correlates with particle-hole asymmetry, while another is in near one-to-one correspondence with the Kondo temperature, a dynamically generated low-energy scale in the impurity model. With symbolic regression, we model this component as a function of bare physical input parameters and "rediscover" the non-perturbative formula for the Kondo temperature. The machine learning pipeline we develop opens opportunities to discover new domain knowledge in other physical systems.
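The two ingredients described above, a reconstruction error and a term that conditions the latent space, can be sketched with the usual variational autoencoder objective for a diagonal Gaussian encoder and a standard normal prior. This is a minimal illustration, not the authors' implementation: the squared-error reconstruction term, the weight `beta`, and the function name `vae_loss` are assumptions.

```python
import numpy as np

def vae_loss(x, x_recon, mu, log_var, beta=1.0):
    """Batch-averaged reconstruction error plus beta-weighted KL term.

    x, x_recon : arrays of shape (batch, n_features)
    mu, log_var: encoder outputs of shape (batch, n_latent)
    """
    # Squared reconstruction error, summed over features.
    recon = np.mean(np.sum((x - x_recon) ** 2, axis=1))
    # KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims.
    kl = np.mean(0.5 * np.sum(mu**2 + np.exp(log_var) - 1.0 - log_var, axis=1))
    return recon + beta * kl, recon, kl

# Toy check: perfect reconstruction, encoder mean shifted to 1 in 2 latent dims.
x = np.zeros((4, 10))
mu = np.ones((4, 2))
log_var = np.zeros((4, 2))
total, recon, kl = vae_loss(x, x.copy(), mu, log_var)
```

The KL term is zero only when the encoder output matches the prior, so increasing `beta` trades reconstruction accuracy for a more regular latent space.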