Jiafei Lyu

EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

Nov 19, 2025

PROF: An LLM-based Reward Code Preference Optimization Framework for Offline Imitation Learning

Nov 14, 2025

ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning

May 29, 2025

Exploration by Random Distribution Distillation

May 16, 2025

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Apr 01, 2025

VLP: Vision-Language Preference Learning for Embodied Manipulation

Feb 17, 2025

Novelty-Guided Data Reuse for Efficient and Diversified Multi-Agent Reinforcement Learning

Dec 20, 2024

ODRL: A Benchmark for Off-Dynamics Reinforcement Learning

Oct 28, 2024

A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning

Oct 18, 2024

SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning

Aug 23, 2024