Henry Peng Zou

Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training

Feb 24, 2026
Via arXiv

CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn and Multi-Step Agentic Tool Use

Feb 12, 2026
Via arXiv

TodyComm: Task-Oriented Dynamic Communication for Multi-Round LLM-based Multi-Agent System

Feb 03, 2026
Via arXiv

From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents

Jun 23, 2025
Via arXiv

A Call for Collaborative Intelligence: Why Human-Agent Systems Should Precede AI Autonomy

Jun 11, 2025
Via arXiv

PersonaAgent: When Large Language Model Agents Meet Personalization at Test Time

Jun 06, 2025
Via arXiv

A Survey on Large Language Model based Human-Agent Systems

May 01, 2025
Via arXiv

Semi-Supervised In-Context Learning: A Baseline Study

Mar 04, 2025
Via arXiv

LLMInit: A Free Lunch from Large Language Models for Selective Initialization of Recommendation

Mar 03, 2025
Via arXiv

TestNUC: Enhancing Test-Time Computing Approaches through Neighboring Unlabeled Data Consistency

Feb 26, 2025
Via arXiv