Simon S. Du

Frank

Improving Human-AI Coordination through Adversarial Training and Generative Models

Apr 21, 2025

Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination

Apr 20, 2025

Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback

Mar 11, 2025

Anytime Acceleration of Gradient Descent

Nov 26, 2024

Learning to Cooperate with Humans using Generative Agents

Nov 21, 2024

The Crucial Role of Samplers in Online Direct Preference Optimization

Sep 29, 2024

Multi-Agent Reinforcement Learning from Human Feedback: Data Coverage and Algorithmic Techniques

Sep 04, 2024

Understanding the Gains from Repeated Self-Distillation

Jul 05, 2024

Toward Global Convergence of Gradient EM for Over-Parameterized Gaussian Mixture Models

Jun 29, 2024

Rethinking Transformers in Solving POMDPs

May 30, 2024