Prior knowledge and symbolic rules in machine learning are often expressed in the form of label constraints, especially in structured prediction problems. In this work, we compare two common strategies for encoding label constraints in a machine learning pipeline, regularization with constraints and constrained inference, by quantifying their impact on model performance. For regularization, we show that it narrows the generalization gap by precluding models that are inconsistent with the constraints. However, its preference for small violations introduces a bias toward a suboptimal model. For constrained inference, we show that it reduces the population risk by correcting a model's violations, and hence turns violation into an advantage. Given these differences, we further explore the use of the two approaches in combination and propose conditions under which constrained inference compensates for the bias introduced by regularization, with the aim of improving both model complexity and optimal risk.
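The two strategies contrasted above can be illustrated with a minimal sketch. The toy setup below is entirely hypothetical (two binary labels under an assumed constraint y1 <= y2, independent sigmoid outputs): regularization adds a penalty on the expected constraint violation to the training loss, while constrained inference instead restricts the argmax at prediction time to assignments satisfying the constraint.

```python
import numpy as np

# Hypothetical toy problem: a model scores two binary labels (y1, y2),
# subject to the example constraint y1 <= y2.

def regularized_loss(scores, gold, lam=1.0):
    """Cross-entropy surrogate plus a penalty on the expected violation."""
    probs = 1.0 / (1.0 + np.exp(-scores))  # independent sigmoids
    ce = -np.sum(gold * np.log(probs) + (1 - gold) * np.log(1 - probs))
    # Expected violation under the factorized distribution: P(y1=1, y2=0).
    exp_viol = probs[0] * (1 - probs[1])
    return ce + lam * exp_viol

def constrained_inference(scores):
    """Highest-scoring joint assignment that satisfies y1 <= y2."""
    best, best_score = None, -np.inf
    for y1 in (0, 1):
        for y2 in (0, 1):
            if y1 > y2:
                continue  # assignment violates the constraint
            s = y1 * scores[0] + y2 * scores[1]
            if s > best_score:
                best, best_score = (y1, y2), s
    return best
```

Note how constrained inference corrects a violating unconstrained prediction: for scores (2.0, -1.0) the unconstrained argmax is (1, 0), which violates the constraint, whereas the constrained argmax is (1, 1).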
With significant advances in generative AI, new technologies are rapidly being deployed with generative components. Generative models are typically trained on large datasets, resulting in model behaviors that can mimic the worst of the content in the training data. Responsible deployment of generative technologies requires content moderation strategies, such as safety input and output filters. Here, we provide a theoretical framework for conceptualizing responsible content moderation of text-to-image generative technologies, including a demonstration of how to empirically measure the constructs we enumerate. We define and distinguish the concepts of safety, fairness, and metric equity, and enumerate example harms that can arise in each domain. We then demonstrate how the defined harms can be quantified. We conclude with a summary of how this style of harm quantification enables data-driven content moderation decisions.
We propose a novel algorithm, Salient Conditional Diffusion (Sancdifi), a state-of-the-art defense against backdoor attacks. Sancdifi uses a denoising diffusion probabilistic model (DDPM) to degrade an image with noise and then recover it using the learned reverse diffusion. Critically, we compute saliency map-based masks to condition our diffusion, allowing the DDPM to apply stronger diffusion to the most salient pixels. As a result, Sancdifi is highly effective at diffusing out triggers in data poisoned by backdoor attacks. At the same time, it reliably recovers salient features when applied to clean data. This performance is achieved without requiring access to the model parameters of the Trojan network, meaning Sancdifi operates as a black-box defense.
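The saliency-conditioned noising step described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: the saliency map is assumed given (a real pipeline would use a method such as Grad-CAM), the noise schedule is a single hand-picked per-pixel level rather than a trained DDPM's schedule, and all names and thresholds are hypothetical.

```python
import numpy as np

def saliency_mask(saliency, top_frac=0.2):
    """Binary mask marking the top-`top_frac` most salient pixels."""
    thresh = np.quantile(saliency, 1.0 - top_frac)
    return (saliency >= thresh).astype(np.float32)

def conditioned_noise(image, saliency, t_strong=0.8, t_weak=0.2, rng=None):
    """Apply stronger forward diffusion (more noise) to salient pixels.

    Uses a variance-preserving mix x_t = sqrt(1 - t) * x_0 + sqrt(t) * eps
    with a per-pixel noise level t selected by the saliency mask.
    """
    rng = rng or np.random.default_rng(0)
    mask = saliency_mask(saliency)
    t = t_weak + (t_strong - t_weak) * mask  # per-pixel noise level
    eps = rng.standard_normal(image.shape)
    return np.sqrt(1.0 - t) * image + np.sqrt(t) * eps, mask
```

In the full method, the learned reverse diffusion would then denoise the masked image, removing any localized backdoor trigger while the weakly-noised background helps preserve clean content.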
We develop a formalism for constructing stochastic upper bounds on the expected full sample risk for supervised classification tasks via the Hilbert coresets approach within a transductive framework. We explicitly compute tight and meaningful bounds for complex datasets and complex hypothesis classes such as state-of-the-art deep neural network architectures. The bounds we develop exhibit desirable properties: i) they are non-uniform in the hypothesis space, ii) in many practical examples, they become effectively deterministic through appropriate choices of prior and training data-dependent posterior distributions on the hypothesis space, and iii) they tighten significantly as the training set grows. We also lay out some ideas to explore in future research.
We present several new results on the feasibility of inferring the hidden states in strongly-connected trackable weak models. Here, a weak model is a directed graph in which each node is assigned a set of colors which may be emitted when that node is visited. A hypothesis is a node sequence which is consistent with a given color sequence. A weak model is said to be trackable if the worst-case number of such hypotheses grows polynomially in the sequence length. We show that the number of hypotheses in strongly-connected trackable models is bounded by a constant and give an expression for this constant. We also consider the problem of reconstructing which branch was taken at a node with same-colored out-neighbors, and show that it is always eventually possible to identify which branch was taken if the model is strongly connected and trackable. We illustrate these properties by assigning transition probabilities and employing standard tools for analyzing Markov chains. In addition, we present new results for the entropy rates of weak models according to whether they are trackable or not. These theorems indicate that the combination of trackability and strong connectivity dramatically simplifies the task of reconstructing which nodes were visited. This work has implications for any problem which can be described in terms of an agent traversing a colored graph, such as the reconstruction of hidden states in a hidden Markov model (HMM).
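The notion of counting hypotheses consistent with a color sequence can be made concrete with a short dynamic program. The weak model below is a hypothetical toy example (the graph, color sets, and function names are ours, not from the paper): for each observed color we propagate, along the graph's edges, the number of consistent node paths ending at each node.

```python
from collections import defaultdict

def count_hypotheses(edges, colors, observation):
    """Count node paths consistent with an observed color sequence.

    edges: dict mapping each node to a list of successor nodes
    colors: dict mapping each node to the set of colors it may emit
    observation: sequence of observed colors
    """
    # counts[v] = number of consistent paths ending at node v
    counts = {v: 1 for v in colors if observation[0] in colors[v]}
    for c in observation[1:]:
        nxt = defaultdict(int)
        for v, n in counts.items():
            for w in edges.get(v, []):
                if c in colors[w]:
                    nxt[w] += n
        counts = dict(nxt)
    return sum(counts.values())
```

Trackability concerns how this count grows with the observation length: in a trackable model it grows at most polynomially, and the results above show that with strong connectivity it is in fact bounded by a constant.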
A colored graph is a directed graph in which either nodes or edges have been assigned colors that are not necessarily unique. Observability problems in such graphs are concerned with whether an agent observing the colors of edges or nodes traversed on a path in the graph can determine its current node or which nodes it visited earlier on the path. Previous research efforts have identified several different notions of observability as well as the associated properties of colored graphs for which those types of observability hold. This paper unifies the prior work into a common framework with several new analytic results about the relationships between those notions and the associated graph properties. The new framework provides an intuitive way to reason about the attainable path reconstruction accuracy as a function of lag and time spent observing, and identifies simple modifications that improve the observability properties of a given graph. This intuition is borne out in a series of numerical experiments. This work has implications for problems that can be described in terms of an agent traversing a colored graph, including the reconstruction of hidden states in a hidden Markov model (HMM).
In this paper, we describe the mechanical design, system overview, integration, and control techniques associated with SKALA, a unique large-sized robot for carrying a person with physical disabilities up and down staircases. As a regular wheelchair is unable to perform such a maneuver, the system functions as a non-conventional wheelchair with several intelligent features. We describe the unique mechanical design and the design choices associated with it. We showcase the embedded control architecture, which allows for several different modes of teleoperation, each of which we describe in detail. We further investigate the architecture associated with the autonomous operation of the system.