Zhiqiang Zhang

When Sharpening Becomes Collapse: Sampling Bias and Semantic Coupling in RL with Verifiable Rewards

Jan 26, 2026

Token-level Collaborative Alignment for LLM-based Generative Recommendation

Jan 26, 2026

MergeMix: Optimizing Mid-Training Data Mixtures via Learnable Model Merging

Jan 25, 2026

Rethinking Sample Polarity in Reinforcement Learning with Verifiable Rewards

Dec 25, 2025

AesTest: Measuring Aesthetic Intelligence from Perception to Production

Nov 09, 2025

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Oct 21, 2025

Arrows of Math Reasoning Data Synthesis for Large Language Models: Diversity, Complexity and Correctness

Aug 26, 2025

A Deep Learning Pipeline Using Synthetic Data to Improve Interpretation of Paper ECG Images

Jul 29, 2025

Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models

Jul 24, 2025

Agentar-Fin-R1: Enhancing Financial Intelligence through Domain Expertise, Training Efficiency, and Advanced Reasoning

Jul 24, 2025