Abstract:We study whether multimodal large language models (MLLMs) can leverage crystallographic plane indices (Miller indices) as a structured latent representation for reasoning about fracture geometry. We formulate Miller indices $z = (h,k,l)$ as a latent variable governing idealized planar fracture and evaluate two complementary capabilities: (i) latent inference, where the model maps visual observations to plane hypotheses under physically valid conditions, and (ii) latent applicability assessment, where the model determines whether such a representation is meaningful for a given fracture image. Through extensive experiments spanning synthetic data, controlled 2D--3D geometric pairs, and real-world fracture images across multiple material classes -- including ceramics, glass, metals, and concrete -- we show that MLLMs can reliably perform latent inference in idealized settings and, critically, can reject the latent representation when the underlying physics does not support it. These results suggest that MLLMs can act as physics-aware reasoning systems conditioned on structured latent priors, provided that the domain of validity is explicitly modeled.
Abstract:This study develops a graph search algorithm to find the optimal discrimination path for the binary classification problem. The objective function is defined as the difference of variations between the true positive (TP) and false positive (FP). It uses the depth first search (DFS) algorithm to find the top-down paths for discrimination. It proposes a dynamic optimization procedure to optimize TP at the upper levels and then reduce FP at the lower levels. To accelerate computing speed with improving accuracy, it proposes a reduced histogram algorithm with variable bin size instead of looping over all data points, to find the feature threshold of discrimination. The algorithm is applied on top of a Support Vector Machine (SVM) model for a binary classification problem on whether a person is fit or unfit. It significantly improves TP and reduces FP of the SVM results (e.g., reduced FP by 90% with a loss of only\ 5% TP). The graph search auto-generates 39 ranked discrimination paths within 9 seconds on an input of total 328,464 objects, using a dual-core Laptop computer with a processor of 2.59 GHz.
Abstract:This study proposes a Newton based multiple objective optimization algorithm for hyperparameter search. The first order differential (gradient) is calculated using finite difference method and a gradient matrix with vectorization is formed for fast computation. The Newton Raphson iterative solution is used to update model parameters with iterations, and a regularization term is included to eliminate the singularity issue. The algorithm is applied to search the optimal probability threshold (a vector of eight parameters) for a multiclass object detection problem of a convolutional neural network. The algorithm quickly finds the improved parameter values to produce an overall higher true positive (TP) and lower false positive (FP) rates, as compared to using the default value of 0.5. In comparison, the Bayesian optimization generates lower performance in the testing case. However, the performance and parameter values may oscillate for some cases during iterations, which may be due to the data driven stochastic nature of the subject. Therefore, the optimal parameter value can be identified from a list of iteration steps according to the optimal TP and FP results.