Ruoxi Jia

Virginia Tech

Data-Centric Human Preference Optimization with Rationales

Jul 19, 2024

AI Risk Categorization Decoded (AIR 2024): From Government Regulations to Corporate Policies

Jun 25, 2024

Can We Trust the Performance Evaluation of Uncertainty Estimation Methods in Text Summarization?

Jun 25, 2024

BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models

Jun 24, 2024

SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors

Jun 20, 2024

Data Shapley in One Training Run

Jun 16, 2024

Fairness-Aware Meta-Learning via Nash Bargaining

Jun 11, 2024

JIGMARK: A Black-Box Approach for Enhancing Image Watermarks against Diffusion Model Edits

Jun 06, 2024

AI Risk Management Should Incorporate Both Safety and Security

May 29, 2024

Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs

May 21, 2024