Haoyang Hong

When Can You Poison Rewards? A Tight Characterization of Reward Poisoning in Linear MDPs

Apr 11, 2026
Via arXiv

Multi-Agent Deep Research: Training Multi-Agent Systems with M-GRPO

Nov 18, 2025
Via arXiv