Muning Wen

MARFT: Multi-Agent Reinforcement Fine-Tuning

Apr 24, 2025
Via arXiv

A Survey of AI Agent Protocols

Apr 23, 2025
Via arXiv

Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning

Feb 27, 2025
Via arXiv

PMAT: Optimizing Action Generation Order in Multi-Agent Reinforcement Learning

Feb 23, 2025
Via arXiv

Learning Humanoid Standing-up Control across Diverse Postures

Feb 12, 2025
Via arXiv

HammerBench: Fine-Grained Function-Calling Evaluation in Real Mobile Device Scenarios

Dec 21, 2024
Via arXiv

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Oct 12, 2024
Via arXiv

Hammer: Robust Function-Calling for On-Device Language Models via Function Masking

Oct 06, 2024
Via arXiv

Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation

Sep 14, 2024
Via arXiv

P3: A Policy-Driven, Pace-Adaptive, and Diversity-Promoted Framework for Optimizing LLM Training

Aug 10, 2024
Via arXiv