Fuzheng Zhang

Kuaishou Natural Language Processing Center and Audio Center

Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization

Aug 12, 2025, via arXiv

AR-GRPO: Training Autoregressive Image Generation Models via Reinforcement Learning

Aug 09, 2025, via arXiv

RLEP: Reinforcement Learning with Experience Replay for LLM Reasoning

Jul 10, 2025, via arXiv

Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search

Jun 11, 2025, via arXiv

DynTok: Dynamic Compression of Visual Tokens for Efficient and Effective Video Understanding

Jun 04, 2025, via arXiv

Towards Reward Fairness in RLHF: From a Resource Allocation Perspective

May 29, 2025, via arXiv

What Makes a Good Reasoning Chain? Uncovering Structural Patterns in Long Chain-of-Thought Reasoning

May 28, 2025, via arXiv

Modality Curation: Building Universal Embeddings for Advanced Multimodal Information Retrieval

May 26, 2025, via arXiv

TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos

May 26, 2025, via arXiv

Clapper: Compact Learning and Video Representation in VLMs

May 21, 2025, via arXiv