Zujie Wen

InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning

Feb 09, 2026
Via arXiv

Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards

Dec 25, 2025
Via arXiv

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Oct 21, 2025
Via arXiv

Enhancing Cross-task Transfer of Large Language Models via Activation Steering

Jul 17, 2025
Via arXiv

Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs

Jun 18, 2025
Via arXiv

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs

Mar 07, 2025
Via arXiv

CARE: A Clue-guided Assistant for CSRs to Read User Manuals

Aug 07, 2024
Via arXiv

Hummer: Towards Limited Competitive Preference Dataset

May 21, 2024
Via arXiv

Strength Lies in Differences! Towards Effective Non-collaborative Dialogues via Tailored Strategy Planning

Mar 11, 2024
Via arXiv

AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback

Feb 02, 2024
Via arXiv