Abstract: Conditional differential entropy provides an intuitive measure for ranking the relative complexity of time series by quantifying the uncertainty in future observations given past context. However, computing it directly for high-dimensional processes with unknown distributions is often intractable. This paper builds on the information-theoretic prediction error bounds established by Fang et al. \cite{fang2019generic}, which show that the conditional differential entropy $h(X_k \mid X_{k-1}, \ldots, X_{k-m})$ is upper bounded by a function of the determinant of the covariance matrix of the next-step prediction errors, for any next-step prediction model. We extend this theoretical framework by deriving a looser upper bound using Hadamard's inequality and the positive semi-definiteness of covariance matrices. To test whether these bounds can be used to rank the complexity of time series, we conduct two synthetic experiments: (1) controlled linear autoregressive processes with additive Gaussian noise, where we compare entropy proxies computed from ordinary least squares prediction errors to the true entropies of various additive noises, and (2) a complexity ranking task on bio-inspired synthetic audio data with unknown entropy, where neural network prediction errors are used to recover the known complexity ordering. This framework provides a computationally tractable method for ranking time-series complexity from the prediction errors of next-step prediction models while maintaining a theoretical foundation in information theory.
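As an illustration of the kind of bound described above, the following minimal sketch simulates a small linear autoregressive process with Gaussian noise, fits a next-step predictor by ordinary least squares, and evaluates two entropy proxies from the residuals: the Gaussian maximum-entropy bound $\frac{1}{2}\log\big((2\pi e)^n \det \Sigma_e\big)$ and its Hadamard relaxation $\frac{1}{2}\sum_i \log\big(2\pi e\,(\Sigma_e)_{ii}\big)$. This is not the authors' code: the function names, the transition matrix `A`, the noise covariance, the lag of one step, and the sample size are all assumptions chosen for illustration, and the exact form of the bound used in the paper may differ.

```python
import numpy as np


def gaussian_entropy_bound(errors):
    """Return 0.5 * log((2*pi*e)^n * det(Sigma)) in nats.

    `errors` is a (T, n) array of prediction errors; Sigma is their
    sample covariance. (Hypothetical helper, not from the paper.)
    """
    n = errors.shape[1]
    sigma = np.atleast_2d(np.cov(errors, rowvar=False))
    _, logdet = np.linalg.slogdet(sigma)
    return 0.5 * (n * np.log(2 * np.pi * np.e) + logdet)


def hadamard_entropy_bound(errors):
    """Looser proxy: replace det(Sigma) by the product of its diagonal."""
    variances = np.var(errors, axis=0, ddof=1)  # diagonal of np.cov
    return 0.5 * np.sum(np.log(2 * np.pi * np.e * variances))


# Assumed toy setup: a stable order-1 vector autoregression in 3 dimensions.
rng = np.random.default_rng(0)
n, T = 3, 20000
A = np.array([[0.5, 0.1, 0.0],
              [0.0, 0.4, 0.2],
              [0.1, 0.0, 0.3]])
noise_cov = np.array([[1.0, 0.6, 0.0],
                      [0.6, 1.0, 0.3],
                      [0.0, 0.3, 0.5]])

w = rng.multivariate_normal(np.zeros(n), noise_cov, size=T)
x = np.zeros((T, n))
for t in range(1, T):
    x[t] = A @ x[t - 1] + w[t]

# Ordinary least squares next-step predictor: x[t] ~ x[t-1] @ coef.
past, nxt = x[:-1], x[1:]
coef, *_ = np.linalg.lstsq(past, nxt, rcond=None)
residuals = nxt - past @ coef

# For this process the conditional entropy equals the noise entropy.
true_entropy = 0.5 * (n * np.log(2 * np.pi * np.e)
                      + np.linalg.slogdet(noise_cov)[1])
det_bound = gaussian_entropy_bound(residuals)
had_bound = hadamard_entropy_bound(residuals)

print("true entropy (nats):   ", true_entropy)
print("determinant bound:     ", det_bound)
print("Hadamard-relaxed bound:", had_bound)
```

In this toy setting the predictor is well specified and the noise is Gaussian, so the determinant-based proxy should sit close to the true entropy, while the Hadamard-relaxed proxy is never smaller and only needs the per-coordinate error variances.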
Abstract: The Fundamental Theorem of Statistical Learning states that a hypothesis space is PAC learnable if and only if its VC dimension is finite. For the agnostic model of PAC learning, the proofs of this theorem in the literature often tacitly impose several measurability assumptions on the sets and functions involved. We scrutinize these proofs from a measure-theoretic perspective in order to extract the assumptions needed for a rigorous argument. This leads to a sound statement as well as a detailed and self-contained proof of the Fundamental Theorem of Statistical Learning in the agnostic setting, showcasing the minimal measurability requirements needed. We then discuss applications in Model Theory, considering NIP and o-minimal structures. Our main theorem presents sufficient conditions for the PAC learnability of hypothesis spaces defined over o-minimal expansions of the reals.