Andrew Wagenmaker

Learning Process Rewards via Success Visitation Matching for Efficient RL

Jun 22, 2026

Improving Robotic Generalist Policies via Flow Reversal Steering

Jun 11, 2026

RoboReward: General-Purpose Vision-Language Reward Models for Robotics

Jan 08, 2026

Robust Finetuning of Vision-Language-Action Robot Policies via Parameter Merging

Dec 18, 2025

Posterior Behavioral Cloning: Pretraining BC Policies for Efficient RL Finetuning

Dec 18, 2025

Active learning of neural population dynamics using two-photon holographic optogenetics

Dec 03, 2024

Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL

Oct 26, 2024

Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification

Oct 10, 2024

Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning

Jun 15, 2024

Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning

Jun 11, 2024