Abstract: As large language models (LLMs) continue to improve and see further integration into software systems, the need to understand the conditions under which they will perform grows as well. We contribute a statistical framework for understanding the impact of specific prompt features on LLM performance. The approach extends previous explainable artificial intelligence (XAI) methods specifically to inspect LLMs by fitting regression models relating portions of the prompt to LLM evaluation. We apply our method to compare how two open-source models, Mistral-7B and GPT-OSS-20B, leverage the prompt to perform a simple arithmetic problem. Regression models of individual prompt portions explain 72% and 77% of variation in model performance, respectively. We find that misinformation in the form of incorrect example query-answer pairs impedes both models from solving the arithmetic query, though positive examples do not. We find significant variability in the impact of positive and negative instructions: these prompts have contradictory effects on model performance. The framework serves as a tool for decision makers in critical scenarios to gain granular insight into how the prompt influences an LLM to solve a task.
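The regression idea described in this abstract can be illustrated with a minimal sketch. It assumes each prompt is encoded as a vector of binary indicators marking which prompt portions are present, and that an evaluation score is regressed on those indicators by ordinary least squares. The feature names, the synthetic scores, and the helper `fit_prompt_regression` are hypothetical and are not taken from the paper; the numbers it produces say nothing about Mistral-7B or GPT-OSS-20B.

```python
import itertools
import numpy as np

# Hypothetical prompt portions (illustrative names, not from the paper).
FEATURES = [
    "correct_example",
    "incorrect_example",
    "positive_instruction",
    "negative_instruction",
]


def fit_prompt_regression(X, y):
    """Fit OLS of an evaluation score on binary prompt-feature indicators.

    Returns (coefficients including intercept, R^2).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones for the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot


# Full factorial design: every on/off combination of the prompt portions.
X = np.array(list(itertools.product([0, 1], repeat=len(FEATURES))))

# Synthetic scores (assumption): intercept plus additive feature effects plus noise.
true_coef = np.array([0.5, 0.1, -0.3, 0.05, -0.05])
rng = np.random.default_rng(0)
y = true_coef[0] + X @ true_coef[1:] + rng.normal(scale=0.05, size=len(X))

coef, r2 = fit_prompt_regression(X, y)
for name, c in zip(["intercept"] + FEATURES, coef):
    print(f"{name:>22s}: {c:+.3f}")
print(f"R^2 = {r2:.3f}")
```

In this toy setup, the sign and size of each coefficient play the role of the per-portion effect, and R^2 plays the role of the share of performance variation explained by the prompt portions.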




Abstract: The nematode Caenorhabditis elegans (C. elegans) is used as a model organism to better understand developmental biology and neurobiology. C. elegans features an invariant cell lineage, which has been catalogued and observed using fluorescence microscopy images. However, established methods to track cells in late-stage development fail to generalize once sporadic muscular twitching has begun. We build upon methodology that uses skin cells as fiducial markers to carry out cell tracking despite random twitching. In particular, we present a cell nucleus segmentation and tracking procedure, integrated into a 3D rendering GUI, to improve efficiency in tracking cells across late-stage development. Results on images depicting muscle cell nuclei across three test embryos suggest that the fiducial markers, in conjunction with a classic tracking paradigm, overcome sporadic twitching.




Abstract: Current methods in multiple object tracking (MOT) rely on independent object trajectories undergoing predictable motion to effectively track large numbers of objects. Adversarial conditions such as volatile object motion and imperfect detections create a challenging tracking landscape in which established methods may yield inadequate results. Multiple hypothesis hypergraph tracking (MHHT) is developed to perform MOT among interdependent objects amid noisy detections. The method extends traditional multiple hypothesis tracking (MHT) via hypergraphs to model correlated object motion, allowing for robust tracking in challenging scenarios. MHHT is applied to perform seam cell tracking during late-stage embryogenesis in C. elegans.




Abstract: Finding an optimal correspondence between point sets is a common task in computer vision. Existing techniques assume relatively simple relationships among points and do not guarantee an optimal match. We introduce an algorithm capable of exactly solving point set matching by modeling the task as hypergraph matching. The algorithm extends the classical branch and bound paradigm to select and aggregate vertices under a proposed decomposition of the multilinear objective function. The methodology is motivated by Caenorhabditis elegans, a model organism used frequently in developmental biology and neurobiology. The C. elegans embryo contains seam cells that can act as fiducial markers, allowing the identification of other nuclei during embryo development. The proposed algorithm identifies seam cells more accurately than established point-set matching methods, while providing a framework to approach other similarly complex point set matching tasks.
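To make the branch and bound paradigm mentioned in this abstract concrete, here is a deliberately simplified sketch. It is not the paper's hypergraph algorithm: it scores a correspondence only with pairwise (second-order) terms, namely the discrepancy between inter-point distances in the two sets, and it bounds each partial assignment by its accumulated cost, which is valid because every term is nonnegative. The function names `match_cost` and `branch_and_bound_match` are hypothetical.

```python
import itertools
import numpy as np


def match_cost(P, Q, assign):
    """Sum of pairwise distance discrepancies for a (possibly partial) assignment.

    assign[i] is the index in Q matched to point i of P.
    """
    cost = 0.0
    for i, j in itertools.combinations(range(len(assign)), 2):
        d_p = np.linalg.norm(P[i] - P[j])
        d_q = np.linalg.norm(Q[assign[i]] - Q[assign[j]])
        cost += abs(d_p - d_q)
    return cost


def branch_and_bound_match(P, Q):
    """Exact search over one-to-one matchings of P into Q with pruning."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    n = len(P)
    best = {"cost": np.inf, "assign": None}

    def recurse(assign, used):
        # All cost terms are nonnegative, so the partial cost is a lower
        # bound on the cost of any completion of this branch.
        cost = match_cost(P, Q, assign)
        if cost >= best["cost"]:
            return  # prune: cannot beat the incumbent
        if len(assign) == n:
            best["cost"], best["assign"] = cost, list(assign)
            return
        for j in range(len(Q)):
            if j not in used:
                recurse(assign + [j], used | {j})

    recurse([], frozenset())
    return best["assign"], best["cost"]


# Toy example: Q is a rotated, shifted, and shuffled copy of P.
P = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
perm = [2, 0, 3, 1]  # point i of P ends up at index perm[i] of Q
Q = np.empty_like(P)
Q[perm] = P @ R.T + np.array([5.0, -1.0])

assign, cost = branch_and_bound_match(P, Q)
print(assign, cost)
```

Because the search only discards branches whose lower bound already meets or exceeds the best complete matching found so far, it returns an optimal matching under this toy objective. Higher-order (hypergraph) terms and tighter bounds would be needed to approach the setting the abstract describes.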