Jiafei Lyu

UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience

Mar 25, 2026
Via arXiv

Model-based Offline RL via Robust Value-Aware Model Learning with Implicitly Differentiable Adaptive Weighting

Mar 09, 2026
Via arXiv

Temporal Difference Learning with Constrained Initial Representations

Feb 12, 2026
Via arXiv

ProAct: Agentic Lookahead in Interactive Environments

Feb 05, 2026
Via arXiv

Cross-Domain Offline Policy Adaptation via Selective Transition Correction

Feb 05, 2026
Via arXiv

EntroPIC: Towards Stable Long-Term Training of LLMs via Entropy Stabilization with Proportional-Integral Control

Nov 19, 2025
Via arXiv

PROF: An LLM-based Reward Code Preference Optimization Framework for Offline Imitation Learning

Nov 14, 2025
Via arXiv

ADG: Ambient Diffusion-Guided Dataset Recovery for Corruption-Robust Offline Reinforcement Learning

May 29, 2025
Via arXiv

Exploration by Random Distribution Distillation

May 16, 2025
Via arXiv

GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning

Apr 01, 2025
Via arXiv