Alberto Maria Metelli

Reusing Trajectories in Policy Gradients Enables Fast Convergence

Jun 06, 2025
Via arXiv

Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes

Jun 06, 2025
Via arXiv

Catoni-Style Change Point Detection for Regret Minimization in Non-Stationary Heavy-Tailed Bandits

May 26, 2025
Via arXiv

Thompson Sampling-like Algorithms for Stochastic Rising Bandits

May 17, 2025
Via arXiv

A Refined Analysis of UCBVI

Feb 24, 2025
Via arXiv

Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models

Jan 30, 2025
Via arXiv

On the Partial Identifiability in Reward Learning: Choosing the Best Reward

Jan 10, 2025
Via arXiv

Statistical Analysis of Policy Space Compression Problem

Nov 15, 2024
Via arXiv

Rising Rested Bandits: Lower Bounds and Efficient Algorithms

Nov 06, 2024
Via arXiv

Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs

Oct 31, 2024
Via arXiv