Jeff Schneider

Carnegie Mellon University

Accelerating Diffusion Models in Offline RL via Reward-Aware Consistency Trajectory Distillation

Jun 09, 2025

Can Large Reasoning Models Self-Train?

May 27, 2025

Policy-Driven World Model Adaptation for Robust Offline Model-based Reinforcement Learning

May 19, 2025

Training a Generally Curious Agent

Feb 24, 2025

TD-M(PC)$^2$: Improving Temporal Difference MPC Through Policy Constraint

Feb 05, 2025

Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning

Oct 15, 2024

Decentralized Uncertainty-Aware Active Search with a Team of Aerial Robots

Oct 11, 2024

Measure Preserving Flows for Ergodic Search in Convoluted Environments

Sep 13, 2024

Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts

Sep 02, 2024

Assigning Credit with Partial Reward Decoupling in Multi-Agent Proximal Policy Optimization

Aug 08, 2024