Alert button
Picture for Muning Wen

Muning Wen

Alert button

TRAD: Enhancing LLM Agents with Step-Wise Thought Retrieval and Aligned Decision

Mar 10, 2024
Ruiwen Zhou, Yingxuan Yang, Muning Wen, Ying Wen, Wenhao Wang, Chunling Xi, Guoqiang Xu, Yong Yu, Weinan Zhang

Viaarxiv icon

Entropy-Regularized Token-Level Policy Optimization for Large Language Models

Feb 09, 2024
Muning Wen, Cheng Deng, Jun Wang, Weinan Zhang, Ying Wen

Viaarxiv icon

Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training

Sep 29, 2023
Xidong Feng, Ziyu Wan, Muning Wen, Ying Wen, Weinan Zhang, Jun Wang

Figure 1 for Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training
Figure 2 for Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training
Figure 3 for Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training
Figure 4 for Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training
Viaarxiv icon

Large Sequence Models for Sequential Decision-Making: A Survey

Jun 24, 2023
Muning Wen, Runji Lin, Hanjing Wang, Yaodong Yang, Ying Wen, Luo Mai, Jun Wang, Haifeng Zhang, Weinan Zhang

Figure 1 for Large Sequence Models for Sequential Decision-Making: A Survey
Figure 2 for Large Sequence Models for Sequential Decision-Making: A Survey
Figure 3 for Large Sequence Models for Sequential Decision-Making: A Survey
Figure 4 for Large Sequence Models for Sequential Decision-Making: A Survey
Viaarxiv icon

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

May 30, 2022
Muning Wen, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, Yaodong Yang

Figure 1 for Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Figure 2 for Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Figure 3 for Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Figure 4 for Multi-Agent Reinforcement Learning is a Sequence Modeling Problem
Viaarxiv icon

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks

Dec 20, 2021
Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu

Figure 1 for Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks
Figure 2 for Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks
Figure 3 for Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks
Figure 4 for Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks
Viaarxiv icon

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks

Dec 06, 2021
Linghui Meng, Muning Wen, Yaodong Yang, Chenyang Le, Xiyun Li, Weinan Zhang, Ying Wen, Haifeng Zhang, Jun Wang, Bo Xu

Figure 1 for Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks
Figure 2 for Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks
Figure 3 for Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks
Figure 4 for Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Conquers All StarCraftII Tasks
Viaarxiv icon

Settling the Variance of Multi-Agent Policy Gradients

Aug 20, 2021
Jakub Grudzien Kuba, Muning Wen, Yaodong Yang, Linghui Meng, Shangding Gu, Haifeng Zhang, David Henry Mguni, Jun Wang

Figure 1 for Settling the Variance of Multi-Agent Policy Gradients
Figure 2 for Settling the Variance of Multi-Agent Policy Gradients
Figure 3 for Settling the Variance of Multi-Agent Policy Gradients
Viaarxiv icon

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

Jun 05, 2021
Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Weinan Zhang, Jun Wang

Figure 1 for MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
Figure 2 for MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
Figure 3 for MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
Figure 4 for MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning
Viaarxiv icon