Aldo Pacchiano

On the Hardness of Bandit Learning

Jun 17, 2025
Via arXiv

Meet Me at the Arm: The Cooperative Multi-Armed Bandits Problem with Shareable Arms

Jun 11, 2025
Via arXiv

Pure Exploration with Feedback Graphs

Mar 10, 2025
Via arXiv

Language Model Personalization via Reward Factorization

Mar 08, 2025
Via arXiv

Adaptive Exploration for Multi-Reward Multi-Policy Evaluation

Feb 04, 2025
Via arXiv

ORSO: Accelerating Reward Design via Online Reward Selection and Policy Optimization

Oct 17, 2024
Via arXiv

State-free Reinforcement Learning

Sep 27, 2024
Via arXiv

Second Order Bounds for Contextual Bandits with Function Approximation

Sep 24, 2024
Via arXiv

Learning Rate-Free Reinforcement Learning: A Case for Model Selection with Non-Stationary Objectives

Aug 07, 2024
Via arXiv

Provable Interactive Learning with Hindsight Instruction Feedback

Apr 14, 2024
Via arXiv