Yuanheng Zhu

CriticSearch: Fine-Grained Credit Assignment for Search Agents via a Retrospective Critic

Nov 15, 2025
Via arXiv

ARAC: Adaptive Regularized Multi-Agent Soft Actor-Critic in Graph-Structured Adversarial Games

Nov 11, 2025
Via arXiv

Empowering Multi-Robot Cooperation via Sequential World Models

Sep 16, 2025
Via arXiv

SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning

Jun 24, 2025
Via arXiv

DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy

Jun 11, 2025
Via arXiv

Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement

Oct 15, 2024
Via arXiv

Discretizing Continuous Action Space with Unimodal Probability Distributions for On-Policy Reinforcement Learning

Aug 01, 2024
Via arXiv

FM3Q: Factorized Multi-Agent MiniMax Q-Learning for Two-Team Zero-Sum Markov Game

Feb 01, 2024
Via arXiv

NeuronsMAE: A Novel Multi-Agent Reinforcement Learning Environment for Cooperative and Competitive Multi-Robot Tasks

Mar 22, 2023
Via arXiv

A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat

Dec 05, 2022
Via arXiv