Xinwei Xiao

GEM: Guided Expectation-Maximization for Behavior-Normalized Candidate Action Selection in Offline RL

Mar 24, 2026
Via arXiv