Jingang Wang

Rethinking the Sampling Criteria in Reinforcement Learning for LLM Reasoning: A Competence-Difficulty Alignment Perspective

May 23, 2025
Via arXiv

Too Consistent to Detect: A Study of Self-Consistent Errors in LLMs

May 23, 2025
Via arXiv

Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment

May 19, 2025
Via arXiv

Dynamic Fisher-weighted Model Merging via Bayesian Optimization

Apr 26, 2025
Via arXiv

NeedleInATable: Exploring Long-Context Capability of Large Language Models towards Long-Structured Tables

Apr 09, 2025
Via arXiv

Investigating and Scaling up Code-Switching for Multilingual Language Model Pre-Training

Apr 02, 2025
Via arXiv

SampleMix: A Sample-wise Pre-training Data Mixing Strategy by Coordinating Data Quality and Diversity

Mar 03, 2025
Via arXiv

Earlier Tokens Contribute More: Learning Direct Preference Optimization From Temporal Decay Perspective

Feb 20, 2025
Via arXiv

FRAMES: Boosting LLMs with A Four-Quadrant Multi-Stage Pretraining Strategy

Feb 08, 2025
Via arXiv

FIRE: Flexible Integration of Data Quality Ratings for Effective Pre-Training

Feb 02, 2025
Via arXiv