Alberto Maria Metelli

Reusing Trajectories in Policy Gradients Enables Fast Convergence

Jun 06, 2025
Via arXiv

Learning Deterministic Policies with Policy Gradients in Constrained Markov Decision Processes

Jun 06, 2025
Via arXiv

Catoni-Style Change Point Detection for Regret Minimization in Non-Stationary Heavy-Tailed Bandits

May 26, 2025
Via arXiv

Thompson Sampling-like Algorithms for Stochastic Rising Bandits

May 17, 2025
Via arXiv

A Refined Analysis of UCBVI

Feb 24, 2025
Via arXiv

Achieving $\widetilde{\mathcal{O}}(\sqrt{T})$ Regret in Average-Reward POMDPs with Known Observation Models

Jan 30, 2025
Via arXiv

On the Partial Identifiability in Reward Learning: Choosing the Best Reward

Jan 10, 2025
Via arXiv

Statistical Analysis of Policy Space Compression Problem

Nov 15, 2024
Via arXiv

Rising Rested Bandits: Lower Bounds and Efficient Algorithms

Nov 06, 2024
Via arXiv

Local Linearity: The Key for No-Regret Reinforcement Learning in Continuous MDPs

Oct 31, 2024
Via arXiv