Yaodong Yang

Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models

Jun 15, 2024

Efficient Adaptation in Mixed-Motive Environments via Hierarchical Opponent Modeling and Planning

Jun 12, 2024

Language Models Resist Alignment

Jun 10, 2024

Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles

Jun 03, 2024

Efficient Model-agnostic Alignment via Bayesian Persuasion

May 29, 2024

INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations

Mar 19, 2024

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

Mar 19, 2024

UniDexFPM: Universal Dexterous Functional Pre-grasp Manipulation Via Diffusion Policy

Mar 19, 2024

Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects

Mar 01, 2024

Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective

Feb 20, 2024