Yaodong Yang

INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations

Mar 19, 2024
Via arXiv

Incentive Compatibility for AI Alignment in Sociotechnical Systems: Positions and Prospects

Mar 01, 2024
Via arXiv

Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective

Feb 20, 2024
Via arXiv

Aligner: Achieving Efficient Alignment through Weak-to-Strong Correction

Feb 06, 2024
Via arXiv

Panacea: Pareto Alignment via Preference Adaptation for LLMs

Feb 03, 2024
Via arXiv

CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents

Jan 19, 2024
Via arXiv

A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning

Dec 12, 2023
Via arXiv

JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models

Nov 30, 2023
Via arXiv

AI Alignment: A Comprehensive Survey

Nov 01, 2023
Via arXiv

Grasp Multiple Objects with One Hand

Oct 24, 2023
Via arXiv