Abstract: As AI systems grow more capable and autonomous, ensuring their safety and reliability requires not only model-level alignment but also strategic oversight of the humans and institutions involved in their development and deployment. Existing safety frameworks largely treat alignment as a static optimization problem (e.g., tuning models to desired behavior) while overlooking the dynamic, adversarial incentives that shape how data are collected, how models are evaluated, and how they are ultimately deployed. We propose a new perspective on AI safety grounded in Stackelberg Security Games (SSGs): a class of game-theoretic models designed for adversarial resource allocation under uncertainty. When AI oversight is viewed as a strategic interaction between defenders (auditors, evaluators, and deployers) and attackers (malicious actors, misaligned contributors, or worst-case failure modes), SSGs provide a unifying framework for reasoning about incentive design, limited oversight capacity, and adversarial uncertainty across the AI lifecycle. We illustrate how this framework can inform (1) training-time auditing against data/feedback poisoning, (2) pre-deployment evaluation under constrained reviewer resources, and (3) robust multi-model deployment in adversarial environments. This synthesis bridges algorithmic alignment and institutional oversight design, highlighting how game-theoretic deterrence can make AI oversight proactive, risk-aware, and resilient to manipulation.
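
To make the SSG framing concrete, below is a minimal sketch of how a defender's optimal randomized oversight allocation can be computed with the standard multiple-LPs formulation for Strong Stackelberg equilibria. The oversight targets and payoff numbers are illustrative assumptions, not values from the abstract.

```python
# Minimal Stackelberg Security Game sketch (multiple-LPs method).
# Target names, payoffs, and the single unit of audit capacity are
# illustrative assumptions, not taken from the paper.
import numpy as np
from scipy.optimize import linprog

targets = ["data pipeline", "eval harness", "deployment API"]
n, resources = len(targets), 1.0          # one unit of auditing capacity to spread

# Defender/attacker utilities when a target is covered vs. uncovered.
Ud_cov = np.array([ 0.0,  0.0,  0.0])     # defender catches the attack
Ud_unc = np.array([-5.0, -3.0, -8.0])     # defender misses the attack
Ua_cov = np.array([-2.0, -1.0, -3.0])     # attacker is caught
Ua_unc = np.array([ 4.0,  2.0,  6.0])     # attacker succeeds

best_value, best_cov = -np.inf, None
for t_star in range(n):                   # assume the attacker best-responds with t_star
    # Decision variables: coverage probabilities c[0..n-1].
    c_obj = np.zeros(n)
    c_obj[t_star] = -(Ud_cov[t_star] - Ud_unc[t_star])   # maximize defender payoff at t_star
    A_ub, b_ub = [], []
    for t in range(n):                    # attacker must weakly prefer t_star over every t
        row = np.zeros(n)
        row[t] += Ua_cov[t] - Ua_unc[t]
        row[t_star] -= Ua_cov[t_star] - Ua_unc[t_star]
        A_ub.append(row)
        b_ub.append(Ua_unc[t_star] - Ua_unc[t])
    A_ub.append(np.ones(n)); b_ub.append(resources)      # limited oversight capacity
    res = linprog(c_obj, A_ub=np.array(A_ub), b_ub=b_ub, bounds=[(0, 1)] * n)
    if res.success:
        value = Ud_unc[t_star] + res.x[t_star] * (Ud_cov[t_star] - Ud_unc[t_star])
        if value > best_value:
            best_value, best_cov = value, res.x

print(dict(zip(targets, np.round(best_cov, 3))), "defender value:", round(best_value, 3))
```

The loop solves one linear program per target the attacker might choose and keeps the allocation with the best defender value, which is exactly the deterrence-through-randomization logic the abstract proposes to transfer to AI oversight.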




Abstract: We propose a machine learning approach to the optimal control of fluid restless multi-armed bandits (FRMABs) with state equations that are either affine or quadratic in the state variables. By deriving fundamental properties of FRMAB problems, we design an efficient machine learning-based algorithm. Using this algorithm, we solve multiple instances with varying initial states to generate a comprehensive training set. We then learn a state feedback policy using Optimal Classification Trees with hyperplane splits (OCT-H). We test our approach on machine maintenance, epidemic control and fisheries control problems. Our method yields high-quality state feedback policies and achieves a speed-up of up to 26 million times compared to a direct numerical algorithm for fluid problems.
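
A rough sketch of the offline-training / online-lookup pipeline described above is given below. The function `solve_fluid_instance` is a hypothetical placeholder for the paper's direct numerical FRMAB solver, and a standard CART tree stands in for OCT-H, which requires the Interpretable AI toolkit.

```python
# Sketch of the offline/online pipeline, under stated assumptions.
# `solve_fluid_instance` is a hypothetical stand-in for the paper's direct
# numerical FRMAB solver; sklearn's CART tree stands in for OCT-H.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def solve_fluid_instance(x0):
    """Hypothetical: return the optimal first action (which arm to activate)
    for a fluid restless bandit started at state x0."""
    # Placeholder rule so the sketch runs: activate the arm with the largest state.
    return int(np.argmax(x0))

# Offline: solve many instances with varying initial states -> training set.
rng = np.random.default_rng(0)
states = rng.uniform(0.0, 1.0, size=(5000, 4))           # 4 arms, fluid states in [0, 1]
actions = np.array([solve_fluid_instance(x) for x in states])

policy = DecisionTreeClassifier(max_depth=6).fit(states, actions)

# Online: the learned state feedback policy reduces to a few tree lookups per query,
# which is where the reported speed-up over the direct numerical algorithm comes from.
x_new = np.array([[0.2, 0.7, 0.1, 0.4]])
print("arm to activate:", policy.predict(x_new)[0])
```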




Abstract: We propose an approach based on machine learning to solve two-stage linear adaptive robust optimization (ARO) problems with binary here-and-now variables and polyhedral uncertainty sets. We encode the optimal here-and-now decisions, the worst-case scenarios associated with the optimal here-and-now decisions, and the optimal wait-and-see decisions into what we denote as the strategy. We solve multiple similar ARO instances in advance using the column and constraint generation algorithm and extract the optimal strategies to generate a training set. We train a machine learning model that predicts high-quality strategies for the here-and-now decisions, the worst-case scenarios associated with the optimal here-and-now decisions, and the wait-and-see decisions. We also introduce an algorithm to reduce the number of different target classes the machine learning algorithm needs to be trained on. We apply the proposed approach to the facility location, the multi-item inventory control, and the unit commitment problems. Our approach solves ARO problems drastically faster than state-of-the-art algorithms while maintaining high accuracy.
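
Below is a minimal sketch of the strategy-prediction idea, under assumptions: `solve_aro_with_ccg` is a hypothetical stand-in for the column-and-constraint generation solver, the toy encoding of a strategy and the k-nearest-neighbors classifier are illustrative choices, and class reduction is approximated by simple deduplication of identical strategies.

```python
# Sketch of learning to predict ARO strategies from instance parameters.
# `solve_aro_with_ccg` and the strategy encoding are illustrative assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def solve_aro_with_ccg(params):
    """Hypothetical: return (here_and_now, worst_case_scenario, wait_and_see)
    for an ARO instance described by `params`."""
    x = tuple((params > params.mean()).astype(int))       # binary here-and-now (toy rule)
    xi = tuple(np.round(params[:2], 1))                    # worst-case scenario (toy)
    y = tuple(np.round(params[:2] * 0.5, 1))               # wait-and-see (toy)
    return (x, xi, y)

# Offline: solve many similar instances and encode each solution as a "strategy".
rng = np.random.default_rng(1)
P = rng.uniform(0, 1, size=(2000, 6))                      # instance parameters
strategies = [solve_aro_with_ccg(p) for p in P]

# Reduce the number of target classes by mapping distinct strategies to ids.
classes = {s: i for i, s in enumerate(dict.fromkeys(strategies))}
labels = np.array([classes[s] for s in strategies])

model = KNeighborsClassifier(n_neighbors=5).fit(P, labels)

# Online: predict a strategy for a new instance; the predicted here-and-now and
# wait-and-see decisions can then be checked for feasibility and plugged in directly.
p_new = rng.uniform(0, 1, size=(1, 6))
predicted_strategy = list(classes)[model.predict(p_new)[0]]
print("predicted (here-and-now, worst-case, wait-and-see):", predicted_strategy)
```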




Abstract: We propose a machine learning approach to the optimal control of multiclass fluid queueing networks (MFQNETs) that provides explicit and insightful control policies. We prove that a threshold-type optimal policy exists for MFQNET control problems, where the threshold curves are hyperplanes passing through the origin. We use Optimal Classification Trees with hyperplane splits (OCT-H) to learn an optimal control policy for MFQNETs. We use numerical solutions of MFQNET control problems as a training set and apply OCT-H to learn explicit control policies. We report experimental results with up to 33 servers and 99 classes that demonstrate that the learned policies achieve 100% accuracy on the test set. While the offline training of OCT-H can take days in large networks, the online application takes milliseconds.
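
The sketch below illustrates the structural result in miniature, under assumptions: `numerically_optimal_priority` is a hypothetical placeholder for the numerical MFQNET solutions used as training data, and a single origin-crossing linear split (logistic regression without an intercept) stands in for one hyperplane split of an OCT-H tree. Because the optimal thresholds pass through the origin, the policy is scale-invariant and states can be normalized before training.

```python
# Sketch of learning a hyperplane-threshold control policy for a toy
# two-class, one-server fluid network. The "numerical solver" and the
# single linear split are illustrative stand-ins, not the paper's method.
import numpy as np
from sklearn.linear_model import LogisticRegression

def numerically_optimal_priority(x):
    """Hypothetical: which of two classes the server should work on at fluid
    state x, as would be read off a numerical MFQNET solution."""
    return int(1.5 * x[0] - x[1] < 0)                      # toy threshold through the origin

rng = np.random.default_rng(2)
X = rng.uniform(0, 10, size=(5000, 2))                     # fluid queue lengths
X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)      # exploit scale-invariance
y = np.array([numerically_optimal_priority(x) for x in X])

# One hyperplane split w . x >= 0 (fit_intercept=False keeps it through the origin).
split = LogisticRegression(fit_intercept=False).fit(X_unit, y)
w = split.coef_[0]
print("learned threshold hyperplane w . x = 0 with w =", np.round(w, 2))
print("serve class", split.predict([[3.0 / 5.0, 4.0 / 5.0]])[0], "at state (3, 4)")
```

Evaluating the learned split online is a single dot product per tree node, which is consistent with the millisecond online runtimes reported in the abstract.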