Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ze Shen Chin

Open Problems in AI Incident Governance

Jul 06, 2026

Harleen Kaur Sidhu, Rebecca Scholefield, Nour Annan, Kevin Hernandez, Isabel Nieh Hou, Abdulrahman Alshaikhi, Ze Shen Chin, Rokas Gipiškis

Abstract:AI systems may produce failures after deployment that pre-deployment safety assessments do not anticipate. Managing these failures requires what we refer to as adequate \textit{AI incident governance}, where having good definitions, taxonomies, monitoring practices, reporting mechanisms, and incident analysis is essential. We examine existing frameworks related to AI incident governance by regulatory bodies and independent efforts, and find that while there are frameworks that describe how individual functions can be performed, there is a lack of consistency within the aspects of definitions, classification, monitoring, and reporting. These inconsistencies apply to the types of incident data that is collected and reported, the ways in which they are categorised, and as a result, the depth, representativeness, and accuracy of analysis that can be performed.

* Accepted to ICML Technical AI Governance Research workshop 2026

Via

Access Paper or Ask Questions

Principles and Guidelines for Randomized Controlled Trials in AI Evaluation

May 03, 2026

Christopher Kelly, Angelica Chowdhury, Alexandra Campili, Bimpe Ayoola, Devin Barbour, Thomas Chen Dawson, Ze Shen Chin, Rokas Gipiškis

Abstract:This work establishes a foundational framework for standardizing AI evaluation RCTs (sometimes called human uplift studies). Drawing on established experimental practices from disciplines with established RCT traditions, including software engineering, economics, clinical and health sciences, and psychology, we adopt the (Shadish et al., 2002) four-validity framework and extend it with a fifth principle on transparency, repeatability, and verification adapted from the Transparency and Openness Promotion (TOP) Guidelines (Center for Open Science, 2025). We operationalize all five principles into 33 guidelines adapted for AI evaluation RCT contexts, expressed as requirements with rationales, implementation instructions, and evidence bases. We position the principles and guidelines as serving three key roles for AI evaluation RCTs: a design tool for planning studies, an evaluation rubric for assessing existing work, and a blueprint for standard setting as the field converges on norms. Our framework extends prior work by centering evaluation on human performance rather than model output alone, formalizing causal inference through RCT methodology for AI contexts, integrating heterogeneity analysis and practical significance assessment, implementing a graded transparency and repeatability framework, and addressing AI-specific challenges including model versioning, human-AI interaction dynamics, contamination and spillover effects, and equitable impact assessment.

* 27 pages, Technical AI Safety Conference

Via

Access Paper or Ask Questions

Dimensional Characterization and Pathway Modeling for Catastrophic AI Risks

Aug 08, 2025

Ze Shen Chin

Figure 1 for Dimensional Characterization and Pathway Modeling for Catastrophic AI Risks

Figure 2 for Dimensional Characterization and Pathway Modeling for Catastrophic AI Risks

Figure 3 for Dimensional Characterization and Pathway Modeling for Catastrophic AI Risks

Figure 4 for Dimensional Characterization and Pathway Modeling for Catastrophic AI Risks

Abstract:Although discourse around the risks of Artificial Intelligence (AI) has grown, it often lacks a comprehensive, multidimensional framework, and concrete causal pathways mapping hazard to harm. This paper aims to bridge this gap by examining six commonly discussed AI catastrophic risks: CBRN, cyber offense, sudden loss of control, gradual loss of control, environmental risk, and geopolitical risk. First, we characterize these risks across seven key dimensions, namely intent, competency, entity, polarity, linearity, reach, and order. Next, we conduct risk pathway modeling by mapping step-by-step progressions from the initial hazard to the resulting harms. The dimensional approach supports systematic risk identification and generalizable mitigation strategies, while risk pathway models help identify scenario-specific interventions. Together, these methods offer a more structured and actionable foundation for managing catastrophic AI risks across the value chain.

* 24 pages including references, 6 figures. To be presented in Technical AI Governance Forum 2025

Via

Access Paper or Ask Questions

Existing Industry Practice for the EU AI Act's General-Purpose AI Code of Practice Safety and Security Measures

Apr 21, 2025

Lily Stelling, Mick Yang, Rokas Gipiškis, Leon Staufer, Ze Shen Chin, Siméon Campos, Michael Chen

Figure 1 for Existing Industry Practice for the EU AI Act's General-Purpose AI Code of Practice Safety and Security Measures

Abstract:This report provides a detailed comparison between the measures proposed in the EU AI Act's General-Purpose AI (GPAI) Code of Practice (Third Draft) and current practices adopted by leading AI companies. As the EU moves toward enforcing binding obligations for GPAI model providers, the Code of Practice will be key to bridging legal requirements with concrete technical commitments. Our analysis focuses on the draft's Safety and Security section which is only relevant for the providers of the most advanced models (Commitments II.1-II.16) and excerpts from current public-facing documents quotes that are relevant to each individual measure. We systematically reviewed different document types - including companies' frontier safety frameworks and model cards - from over a dozen companies, including OpenAI, Anthropic, Google DeepMind, Microsoft, Meta, Amazon, and others. This report is not meant to be an indication of legal compliance nor does it take any prescriptive viewpoint about the Code of Practice or companies' policies. Instead, it aims to inform the ongoing dialogue between regulators and GPAI model providers by surfacing evidence of precedent.

* 158 pages

Via

Access Paper or Ask Questions

Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

Oct 30, 2024

Rokas Gipiškis, Ayrton San Joaquin, Ze Shen Chin, Adrian Regenfuß, Ariel Gil, Koen Holtman

Figure 1 for Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

Figure 2 for Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

Figure 3 for Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

Figure 4 for Risk Sources and Risk Management Measures in Support of Standards for General-Purpose AI Systems

Abstract:There is an urgent need to identify both short and long-term risks from newly emerging types of Artificial Intelligence (AI), as well as available risk management measures. In response, and to support global efforts in regulating AI and writing safety standards, we compile an extensive catalog of risk sources and risk management measures for general-purpose AI (GPAI) systems, complete with descriptions and supporting examples where relevant. This work involves identifying technical, operational, and societal risks across model development, training, and deployment stages, as well as surveying established and experimental methods for managing these risks. To the best of our knowledge, this paper is the first of its kind to provide extensive documentation of both GPAI risk sources and risk management measures that are descriptive, self-contained and neutral with respect to any existing regulatory framework. This work intends to help AI providers, standards experts, researchers, policymakers, and regulators in identifying and mitigating systemic risks from GPAI systems. For this reason, the catalog is released under a public domain license for ease of direct use by stakeholders in AI governance and standards.

* 91 pages, 8 figures

Via

Access Paper or Ask Questions