Jeff Schneider

Carnegie Mellon University

Maximum Likelihood Reinforcement Learning

Feb 02, 2026
Via arXiv

Continual Policy Distillation from Distributed Reinforcement Learning Teachers

Jan 30, 2026
Via arXiv

Latent Policy Steering with Embodiment-Agnostic Pretrained World Models

Jul 17, 2025
Via arXiv

Multi-Timescale Dynamics Model Bayesian Optimization for Plasma Stabilization in Tokamaks

Jun 12, 2025
Via arXiv

Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation

Jun 09, 2025
Via arXiv

Can Large Reasoning Models Self-Train?

May 27, 2025
Via arXiv

Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning

May 19, 2025
Via arXiv

Training a Generally Curious Agent

Feb 24, 2025
Via arXiv

TD-M(PC)$^2$: Improving Temporal Difference MPC Through Policy Constraint

Feb 05, 2025
Via arXiv

Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning

Oct 15, 2024
Via arXiv