Yi Wu

Learned Ranking Function: From Short-term Behavior Predictions to Long-term User Satisfaction

Aug 12, 2024
Via arXiv

PM-VIS+: High-Performance Video Instance Segmentation without Video Annotation

Jun 28, 2024
Via arXiv

ReaLHF: Optimized RLHF Training for Large Language Models through Parameter Reallocation

Jun 20, 2024
Via arXiv

FlightBench: A Comprehensive Benchmark of Spatial Planning Methods for Quadrotors

Jun 09, 2024
Via arXiv

ESP: Extro-Spective Prediction for Long-term Behavior Reasoning in Emergency Scenarios

May 07, 2024
Via arXiv

MESA: Cooperative Meta-Exploration in Multi-Agent Learning through Exploiting State-Action Space Structure

May 01, 2024
Via arXiv

PM-VIS: High-Performance Box-Supervised Video Instance Segmentation

Apr 22, 2024
Via arXiv

Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study

Apr 16, 2024
Via arXiv

Long-horizon Locomotion and Manipulation on a Quadrupedal Robot with Large Language Models

Apr 08, 2024
Via arXiv

Leveraging Symmetry in RL-based Legged Locomotion Control

Mar 27, 2024
Via arXiv