Abstract: We introduce the **Concept Field** of a text corpus: a local drift field with pointwise uncertainty, estimated in sentence-embedding space from the deltas between consecutive sentences. Given a candidate sentence transition, we score its agreement with the field by $\zeta$, the mean absolute z-distance between the observed delta and the field's local Gaussian estimate. The score is black-box (no model internals), corpus-attributable (every score traces to nearby corpus sentences), and admits a direct probabilistic reading. To support the computation, we introduce a **Vector Sequence Database (VSDB)** that stores embeddings together with sequence-position and next-delta metadata. We evaluate this approach in two large-scale settings: hallucination-style groundedness detection over the U.S. Code of Federal Regulations, and novelty detection over Project Gutenberg. Using controlled LLM-generated rewrites, Concept Fields achieve strong selective-classification performance under a grounded / ungrounded / unsure triage policy and, unlike retrieval-centric baselines, exhibit similar coverage-risk behavior across both domains, supporting a probability-based interpretation that transfers across domains. We also sketch how the divergence and curl of the Concept Field, computed on dense clusters, surface qualitatively meaningful semantic patterns (logic sources, sinks, and implicit topics), which we offer as hypothesis-generating rather than as a quantitative result. Concept Fields provide a fast, lightweight, and interpretable signal for groundedness and novelty, complementary to LLM-as-judge and white-box detectors.
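To make the score concrete, here is a minimal sketch of $\zeta$, not the paper's implementation: it assumes a sentence embedder `embed`, fits a diagonal Gaussian to the next-deltas of the $k$ nearest corpus sentences as the local field estimate, and stands in a flat k-NN index for the VSDB; all names and parameters are illustrative.

```python
# Hypothetical sketch of the Concept Field zeta score. embed(), k, and the
# epsilon floor on sigma are illustrative assumptions, not the paper's code.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_field(sentences, embed, k=16):
    """Index corpus embeddings alongside each sentence's delta to its successor."""
    E = np.stack([embed(s) for s in sentences])     # (n, d) sentence embeddings
    deltas = E[1:] - E[:-1]                         # next-delta for sentences 0..n-2
    index = NearestNeighbors(n_neighbors=k).fit(E[:-1])
    return index, deltas

def zeta(prev_sent, next_sent, embed, index, deltas):
    """Mean absolute z-distance of the observed delta under the local field."""
    x = embed(prev_sent)
    d_obs = embed(next_sent) - x                    # candidate transition delta
    _, idx = index.kneighbors(x[None, :])           # nearby corpus sentences
    local = deltas[idx[0]]                          # their next-deltas, (k, d)
    mu, sigma = local.mean(axis=0), local.std(axis=0) + 1e-8
    return float(np.mean(np.abs((d_obs - mu) / sigma)))
```

In this sketch the VSDB collapses to the pair `(index, deltas)`: an embedding index whose entries implicitly carry their sequence position and explicitly carry their next-delta, which is exactly the metadata the abstract says the VSDB stores.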
Abstract: We introduce an intuitive method to test the robustness (stability and explainability) of any black-box LLM in real time, based upon the local deviation from harmonicity, denoted $\gamma$. To the best of our knowledge, this is the first completely model-agnostic and unsupervised method of measuring the robustness of any given response from an LLM, based upon the model itself conforming to a purely mathematical standard. We conduct human-annotation experiments to show the positive correlation of $\gamma$ with false or misleading answers, and demonstrate that following the gradient of $\gamma$ in stochastic gradient ascent efficiently exposes adversarial prompts. Measuring $\gamma$ across thousands of queries in popular LLMs (GPT-4, ChatGPT, Claude-2.1, Mixtral-8x7B, Smaug-72B, Llama2-7B, and MPT-7B) allows us to estimate the likelihood of wrong or hallucinatory answers automatically, and to quantitatively rank the reliability of these models in various objective domains (Web QA, TruthfulQA, and Programming QA). Across all models and domains tested, human ratings confirm that $\gamma \to 0$ indicates trustworthiness, and the low-$\gamma$ leaders among these models are GPT-4, ChatGPT, and Smaug-72B.
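Read literally, $\gamma$ measures how far a response sits from the average response over nearby prompts, i.e., the deviation from the mean value property of harmonic functions. Below is a minimal sketch under loud assumptions: `query` calls the black-box LLM, `embed` maps responses to vectors, and `perturb` applies a small prompt perturbation; all three are hypothetical stand-ins, and the paper's actual perturbation scheme and distance may differ.

```python
# Illustrative gamma: distance between the embedded response at a prompt and
# the mean embedded response over n perturbed prompts. gamma -> 0 when the
# response behaves harmonically, i.e., equals its local mean.
import numpy as np

def gamma(prompt, query, embed, perturb, n=8):
    center = embed(query(prompt))                             # response at the prompt
    ball = [embed(query(perturb(prompt))) for _ in range(n)]  # responses on a small "sphere"
    return float(np.linalg.norm(center - np.mean(ball, axis=0)))
```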

Abstract: We introduce Harmonic Robustness, a powerful and intuitive method to test the robustness of any machine-learning model, either during training or in black-box real-time inference monitoring, without ground-truth labels. It is based on functional deviation from the harmonic mean-value property, which indicates instability and lack of explainability. We show implementation examples in low-dimensional trees and feedforward NNs, where the method reliably identifies overfitting, as well as in more complex high-dimensional models such as ResNet-50 and Vision Transformer, where it efficiently measures adversarial vulnerability across image classes.
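A minimal sketch of the underlying check, assuming a model `f` that maps a batch of flat input vectors to outputs: compare the model's value at a point with its mean over a small sphere around that point, which agree exactly when `f` is harmonic. The radius, sample count, and batching convention here are illustrative choices, not the paper's.

```python
# Deviation from the harmonic mean-value property for a generic model f.
import numpy as np

def harmonic_deviation(f, x, r=0.1, n=64, seed=0):
    """Monte Carlo estimate of ||f(x) - mean of f over a radius-r sphere||; ~0 for harmonic f."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=(n, x.size))
    u = r * u / np.linalg.norm(u, axis=1, keepdims=True)  # n points at radius r
    sphere = x[None, :] + u                               # sampled sphere around x
    return float(np.linalg.norm(f(x[None, :])[0] - f(sphere).mean(axis=0)))
```

Large values flag inputs where the model is locally unstable (e.g., near an adversarial direction), and tracking the deviation on held-out points during training gives a label-free overfitting signal, consistent with the uses the abstract describes.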