Han Zhong

Zachary

Bringing Value Models Back: Generative Critics for Value Modeling in LLM Reinforcement Learning

Apr 12, 2026

Shopping with a Platform AI Assistant: Who Adopts, When in the Journey, and What For

Mar 26, 2026

Robust Assortment Optimization from Observational Data

Feb 11, 2026

Muon in Associative Memory Learning: Training Dynamics and Scaling Laws

Feb 05, 2026

Optimism Stabilizes Thompson Sampling for Adaptive Inference

Feb 05, 2026

The Sample Complexity of Online Strategic Decision Making with Information Asymmetry and Knowledge Transportability

Jun 11, 2025

Less is More: Improving LLM Alignment via Preference Data Selection

Feb 22, 2025

Learning an Optimal Assortment Policy under Observational Data

Feb 10, 2025

BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning

Jan 31, 2025

A3S: A General Active Clustering Method with Pairwise Constraints

Jul 14, 2024