Guoxi Zhang

SPHERE: Mitigating the Loss of Spectral Plasticity in Mixture-of-Experts for Deep Reinforcement Learning

May 06, 2026

Hypothesis Graph Refinement: Hypothesis-Driven Exploration with Cascade Error Correction for Embodied Navigation

Apr 05, 2026

Stable Reasoning, Unstable Responses: Mitigating LLM Deception via Stability Asymmetry

Mar 27, 2026

VISA: Value Injection via Shielded Adaptation for Personalized LLM Alignment

Mar 05, 2026

MVR: Multi-view Video Reward Shaping for Reinforcement Learning

Mar 02, 2026

A Game-Theoretic Negotiation Framework for Cross-Cultural Consensus in LLMs

Jun 16, 2025

VickreyFeedback: Cost-efficient Data Construction for Reinforcement Learning from Human Feedback

Sep 27, 2024

INSIGHT: End-to-End Neuro-Symbolic Visual Reinforcement Learning with Language Explanations

Mar 19, 2024

Online Policy Learning from Offline Preferences

Mar 15, 2024

Estimating Treatment Effects Under Heterogeneous Interference

Sep 25, 2023