Kai Xiao

Tony

IH-Challenge: A Training Dataset to Improve Instruction Hierarchy on Frontier LLMs

Mar 11, 2026
Via arXiv

OpenAI GPT-5 System Card

Dec 19, 2025
Via arXiv

Human-AI collaborative autonomous synthesis with pulsed laser deposition for remote epitaxy

Nov 14, 2025
Via arXiv

Trading Inference-Time Compute for Adversarial Robustness

Jan 31, 2025
Via arXiv

Diverse and Effective Red Teaming with Auto-generated Rewards and Multi-step Reinforcement Learning

Dec 24, 2024
Via arXiv

OpenAI o1 System Card

Dec 21, 2024
Via arXiv

The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions

Apr 19, 2024
Via arXiv

Quantifying and Defending against Privacy Threats on Federated Knowledge Graph Embedding

Apr 06, 2023
Via arXiv

On Distinctive Properties of Universal Perturbations

Dec 31, 2021
Via arXiv

SHORING: Design Provable Conditional High-Order Interaction Network via Symbolic Testing

Jul 03, 2021
Via arXiv