Owen Oertell

OfficeQA Pro: An Enterprise Benchmark for End-to-End Grounded Reasoning

Mar 09, 2026
Via arXiv

KARL: Knowledge Agents via Reinforcement Learning

Mar 05, 2026
Via arXiv

LLMs Can Learn to Reason Via Off-Policy RL

Feb 22, 2026
Via arXiv

Scaling Offline RL via Efficient and Expressive Shortcut Models

May 28, 2025
Via arXiv

Efficient Controllable Diffusion via Optimal Classifier Guidance

May 27, 2025
Via arXiv

Convergence Of Consistency Model With Multistep Sampling Under General Data Assumptions

May 06, 2025
Via arXiv

TurboHopp: Accelerated Molecule Scaffold Hopping with Consistency Models

Oct 28, 2024
Via arXiv

REBEL: Reinforcement Learning via Regressing Relative Rewards

Apr 25, 2024
Via arXiv

Dataset Reset Policy Optimization for RLHF

Apr 15, 2024
Via arXiv

RL for Consistency Models: Faster Reward Guided Text-to-Image Generation

Mar 25, 2024
Via arXiv