Aditya Gopalan

Reliable Policy Iteration: Performance Robustness Across Architecture and Environment Perturbations

Dec 12, 2025
Via arXiv

Why DPO is a Misspecified Estimator and How to Fix It

Oct 23, 2025
Via arXiv

Reliable Critics: Monotonic Improvement and Convergence Guarantees for Reinforcement Learning

Jun 08, 2025
Via arXiv

Towards Reliable Alignment: Uncertainty-aware RLHF

Oct 31, 2024
Via arXiv

Testing the Feasibility of Linear Programs with Bandit Feedback

Jun 21, 2024
Via arXiv

When are Bandits Robust to Misspecification?

Oct 13, 2023
Via arXiv

A Unified Framework for Discovering Discrete Symmetries

Sep 06, 2023
Via arXiv

On the Minimax Regret for Linear Bandits in a wide variety of Action Spaces

Jan 09, 2023
Via arXiv

Exploration in Linear Bandits with Rich Action Sets and its Implications for Inference

Jul 23, 2022
Via arXiv

Actor-Critic based Improper Reinforcement Learning

Jul 19, 2022
Via arXiv