Linjing Li

PAEC: Position-Aware Entropy Calibration for LLM Reasoning in RLVR

Jun 07, 2026
Via arXiv

MatMind: A Structure-Activity Knowledge-Driven Generative Foundation Model for Materials Science

Jun 05, 2026
Via arXiv

Dynamic Dual-Granularity Skill Bank for Agentic RL

Mar 30, 2026
Via arXiv

Spec-o3: A Tool-Augmented Vision-Language Agent for Rare Celestial Object Candidate Vetting via Automated Spectral Inspection

Jan 10, 2026
Via arXiv

Uncertainty Unveiled: Can Exposure to More In-context Examples Mitigate Uncertainty for Large Language Models?

May 27, 2025
Via arXiv

Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning

May 20, 2025
Via arXiv

Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning

May 20, 2025
Via arXiv

Learning When to Think: Shaping Adaptive Reasoning in R1-Style Models via Multi-Stage RL

May 16, 2025
Via arXiv

Learning Dynamics in Continual Pre-Training for Large Language Models

May 12, 2025
Via arXiv

Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation

Mar 17, 2025
Via arXiv