Abstract:Qualitative analysis of open-ended survey responses is a commonly-used research method in the social sciences, but traditional coding approaches are often time-consuming and prone to inconsistency. Existing solutions from Natural Language Processing such as supervised classifiers, topic modeling techniques, and generative large language models have limited applicability in qualitative analysis, since they demand extensive labeled data, disrupt established qualitative workflows, and/or yield variable results. In this paper, we introduce a text embedding-based classification framework that requires only a handful of examples per category and fits well with standard qualitative workflows. When benchmarked against human analysis of a conceptual physics survey consisting of 2899 open-ended responses, our framework achieves a Cohen's Kappa ranging from 0.74 to 0.83 as compared to expert human coders in an exhaustive coding scheme. We further show how performance of this framework improves with fine-tuning of the text embedding model, and how the method can be used to audit previously-analyzed datasets. These findings demonstrate that text embedding-assisted coding can flexibly scale to thousands of responses without sacrificing interpretability, opening avenues for deductive qualitative analysis at scale.
Abstract:This Letter introduces an approach for precisely designing surface friction properties using a conditional generative machine learning model, specifically a diffusion denoising probabilistic model (DDPM). We created a dataset of synthetic surfaces with frictional properties determined by molecular dynamics simulations, which trained the DDPM to predict surface structures from desired frictional outcomes. Unlike traditional trial-and-error and numerical optimization methods, our approach directly yields surface designs meeting specified frictional criteria with high accuracy and efficiency. This advancement in material surface engineering demonstrates the potential of machine learning in reducing the iterative nature of surface design processes. Our findings not only provide a new pathway for precise surface property tailoring but also suggest broader applications in material science where surface characteristics are critical.