Abstract: We study sample quantiles of distributions indexed by estimated parameters, with a focus on Value-at-Risk for linear projections of financial returns whose underlying probability law is heavy-tailed. In this setting, both the projection direction and the empirical quantile threshold are estimated from the data, so the standard Bahadur representation under a fixed distribution does not separate the distinct sources of instability. A canonical starting point is Bahadur's representation, which expresses the sample quantile through the empirical distribution function plus a remainder term \cite{bahadur1966}. Empirical-process theory provides a usable scaffolding through the mechanics of half-spaces, symmetric differences, and Glivenko--Cantelli uniform convergence. These tools yield stability bounds, but they absorb changes in projection direction and changes in quantile threshold into a single symmetric-difference measure. Consequently, a global uniform-convergence requirement is imposed on what is intrinsically a local quantile-stability problem. This paper introduces a Q-Q orthogonality formulation for separating projection-direction and quantile-threshold effects. The object of interest is the difference between the empirical quantile computed at the estimated projection direction and the population quantile computed at the reference projection direction. We decompose this difference into three terms, $\hat q_\alpha(\hat w)-q_\alpha(w_0)=D_1+D_2+D_3$. Here, $D_1$ measures the population quantile movement induced by perturbing the projection direction, $D_2$ measures the empirical quantile fluctuation with the projection direction held fixed, and $D_3$ is the Bahadur-type remainder.
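
The three-term split can be illustrated numerically. The following is a minimal sketch under loud assumptions, not the paper's construction: the abstract does not give formulas for the terms, so here $D_1$ is taken as $q_\alpha(\hat w)-q_\alpha(w_0)$, $D_2$ as $\hat q_\alpha(w_0)-q_\alpha(w_0)$, and $D_3$ is simply defined as the residual so that the identity holds by construction. The population quantile is approximated by a large independent sample, the returns are simulated Student-t draws, and `w_hat` is a hypothetical perturbed direction rather than an actual estimator.

```python
import math
import random


def student_t(rng, df):
    """Draw one heavy-tailed Student-t variate with df degrees of freedom."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square(df) = Gamma(shape=df/2, scale=2)
    return z / math.sqrt(chi2 / df)


def sample_returns(rng, n, dim, df=3.0):
    """Simulate n return vectors with independent Student-t coordinates (assumption)."""
    return [[student_t(rng, df) for _ in range(dim)] for _ in range(n)]


def project(data, w):
    """Linear projection w'x for every row x."""
    return [sum(wi * xi for wi, xi in zip(w, row)) for row in data]


def quantile(values, alpha):
    """Empirical alpha-quantile: the order statistic x_(k) with k = ceil(n * alpha)."""
    xs = sorted(values)
    k = max(1, math.ceil(alpha * len(xs)))
    return xs[k - 1]


def normalize(w):
    norm = math.sqrt(sum(wi * wi for wi in w))
    return [wi / norm for wi in w]


def decompose(sample, population, w_hat, w0, alpha):
    """Split q_hat(w_hat) - q(w0) into D1 + D2 + D3 (D3 defined as the residual)."""
    q_hat_what = quantile(project(sample, w_hat), alpha)    # empirical, estimated direction
    q_hat_w0 = quantile(project(sample, w0), alpha)         # empirical, reference direction
    q_pop_what = quantile(project(population, w_hat), alpha)  # population proxy at w_hat
    q_pop_w0 = quantile(project(population, w0), alpha)       # population proxy at w0
    total = q_hat_what - q_pop_w0
    d1 = q_pop_what - q_pop_w0   # population movement from perturbing the direction
    d2 = q_hat_w0 - q_pop_w0     # empirical fluctuation with the direction held fixed
    d3 = total - d1 - d2         # remainder, by construction
    return total, d1, d2, d3


rng = random.Random(0)
w0 = normalize([1.0, 1.0, 1.0])
w_hat = normalize([1.05, 0.95, 1.02])        # hypothetical estimated direction
sample = sample_returns(rng, 2000, 3)        # observed data
population = sample_returns(rng, 20000, 3)   # large-sample proxy for the population law
total, d1, d2, d3 = decompose(sample, population, w_hat, w0, alpha=0.05)
print(f"total={total:.4f}  D1={d1:.4f}  D2={d2:.4f}  D3={d3:.4f}")
```

Because $D_3$ is computed as a residual, the sketch only shows how the two named effects can be read off separately; it says nothing about the order of the remainder, which is what a Bahadur-type result would control.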




Abstract: Current literature in criminal justice analytics often focuses on predicting the likelihood of recidivism (repeat offenses committed by released defendants), but this problem is fraught with ethical missteps ranging from selection bias in data collection to limited model interpretability. This paper re-purposes Machine Learning (ML) in criminal justice to identify social determinants of recidivism, with contributions along three dimensions. (1) We shift the focus from predicting which individuals will re-offend to identifying the broader underlying factors that explain differences in recidivism, with the goal of providing a reliable framework for preventative policy intervention. (2) Recidivism models typically pool all individuals into one dataset to carry out ML tasks. We instead apply unsupervised learning to reduce noise and extract homogeneous subgroups of individuals, with a novel heuristic to find the optimal number of subgroups. (3) We subsequently apply supervised learning within the subgroups to determine statistically significant features that are correlated with recidivism. It is our view that this new approach to a long-standing question will serve as a useful guide for the practical application of ML in policymaking.




Abstract: We present algorithms for the detection of a class of heart arrhythmias with the goal of eventual adoption by practicing cardiologists. In clinical practice, detection is based on a small number of meaningful features extracted from the heartbeat cycle. However, techniques proposed in the literature use high-dimensional vectors consisting of morphological and time-based features for detection. Using electrocardiogram (ECG) signals, we found that smaller subsets of features are sufficient to detect arrhythmias with high accuracy. The features were found by an iterative step-wise feature selection method. We depart from the common literature in the following aspects: (1) as opposed to high-dimensional feature vectors, we use a small set of features with meaningful clinical interpretation; (2) we eliminate the need to append short-duration patient-specific ECG data to the global training data for classification; (3) we apply semi-parametric classification procedures (in an ensemble framework) for arrhythmia detection; and (4) our approach is based on a reduced sampling rate of approximately 115 Hz, as opposed to the 360 Hz used in the standard literature.
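
As an illustration of what an iterative step-wise feature selection loop can look like, here is a minimal sketch. It is not the authors' procedure: the abstract does not say whether selection is forward or backward, nor which classifier scores the candidates. The sketch assumes greedy forward selection, a nearest-centroid classifier as a stand-in scorer, a hypothetical stopping threshold `min_gain`, and synthetic data in which only features 0 and 1 carry signal (no ECG data is used).

```python
import random


def nearest_centroid_accuracy(train, test, features):
    """Accuracy of a nearest-centroid classifier restricted to the given feature indices."""
    centroids = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = [sum(r[f] for r in rows) / len(rows) for f in features]
    correct = 0
    for x, y in test:
        pred = min(
            centroids,
            key=lambda c: sum((x[f] - m) ** 2 for f, m in zip(features, centroids[c])),
        )
        correct += pred == y
    return correct / len(test)


def forward_select(train, valid, n_features, min_gain=0.01):
    """Greedily add the feature that most improves validation accuracy.

    Stops when the best candidate improves accuracy by less than min_gain
    (a hypothetical stopping rule chosen for this sketch).
    """
    selected, best = [], 0.0
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scored = [
            (nearest_centroid_accuracy(train, valid, selected + [f]), f)
            for f in candidates
        ]
        score, feature = max(scored)
        if score - best < min_gain:
            break
        selected.append(feature)
        best = score
    return selected, best


def make_data(rng, n, n_features=6, informative=2, shift=2.0):
    """Synthetic two-class data: only the first `informative` features separate the classes."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(shift * y if f < informative else 0.0, 1.0) for f in range(n_features)]
        data.append((x, y))
    return data


rng = random.Random(0)
train = make_data(rng, 400)
valid = make_data(rng, 200)
selected, best = forward_select(train, valid, n_features=6)
print("selected features:", selected, "validation accuracy:", round(best, 3))
```

Scoring on a held-out validation split rather than on the training data keeps the greedy loop from rewarding features that merely fit noise, which matters when the goal is a small, interpretable feature set.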