Jing-Cheng Pang

VLGOR: Visual-Language Knowledge Guided Offline Reinforcement Learning for Generalizable Agents

Mar 24, 2026
Via arXiv

RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution

Mar 21, 2026
Via arXiv

Reinforcement Learning with Promising Tokens for Large Language Models

Feb 03, 2026
Via arXiv

EDCO: Dynamic Curriculum Orchestration for Domain-specific Large Language Model Fine-tuning

Jan 07, 2026
Via arXiv

ReLAM: Learning Anticipation Model for Rewarding Visual Robotic Manipulation

Sep 26, 2025
Via arXiv

ImagineBench: Evaluating Reinforcement Learning with Large Language Model Rollouts

May 15, 2025
Via arXiv

Beyond Simple Sum of Delayed Rewards: Non-Markovian Reward Modeling for Reinforcement Learning

Oct 26, 2024
Via arXiv

Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts

Apr 14, 2024
Via arXiv

Empowering Language Models with Active Inquiry for Deeper Understanding

Feb 06, 2024
Via arXiv

Language Model Self-improvement by Reinforcement Learning Contemplation

May 23, 2023
Via arXiv