Hongyao Tang

Embodied Arena: A Comprehensive, Unified, and Evolving Evaluation Platform for Embodied AI

Sep 18, 2025

Squeeze the Soaked Sponge: Efficient Off-policy Reinforcement Finetuning for Large Language Model

Jul 09, 2025

Can We Optimize Deep RL Policy Weights as Trajectory Modeling?

Mar 06, 2025

Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer

Feb 04, 2025

Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn

Sep 07, 2024

MFE-ETP: A Comprehensive Evaluation Benchmark for Multi-modal Foundation Models on Embodied Task Planning

Jul 06, 2024

Bridging Evolutionary Algorithms and Reinforcement Learning: A Comprehensive Survey

Jan 22, 2024

The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting

Mar 02, 2023

State-Aware Proximal Pessimistic Algorithms for Offline Reinforcement Learning

Nov 28, 2022

ERL-Re$^2$: Efficient Evolutionary Reinforcement Learning with Shared State Representation and Individual Policy Representation

Oct 26, 2022