Chaowen Hu

From $\boldsymbol{\log \pi}$ to $\boldsymbol{\pi}$: Taming Divergence in Soft Clipping via Bilateral Decoupled Decay of Probability Gradient Weight

Mar 15, 2026
Via arXiv

How to Allocate, How to Learn? Dynamic Rollout Allocation and Advantage Modulation for Policy Optimization

Feb 22, 2026
Via arXiv

MASPO: Unifying Gradient Utilization, Probability Mass, and Signal Reliability for Robust and Sample-Efficient LLM Reasoning

Feb 19, 2026
Via arXiv

SERVAL: Synergy Learning between Vertical Models and LLMs towards Oracle-Level Zero-shot Medical Prediction

Mar 03, 2024
Via arXiv

Sample-efficient Multi-objective Molecular Optimization with GFlowNets

Feb 08, 2023
Via arXiv