Shinji Ito

Reinforcement Learning from Adversarial Preferences in Tabular MDPs

Jul 15, 2025
Via arXiv

Optimal Regret of Bernoulli Bandits under Global Differential Privacy

May 08, 2025

Bandit Max-Min Fair Allocation

May 08, 2025

Influential Bandits: Pulling an Arm May Change the Environment

Apr 11, 2025

Instance-Dependent Regret Bounds for Learning Two-Player Zero-Sum Games with Bandit Feedback

Feb 24, 2025

Data-dependent Bounds with $T$-Optimal Best-of-Both-Worlds Guarantees in Multi-Armed Bandits using Stability-Penalty Matching

Feb 12, 2025

Corrupted Learning Dynamics in Games

Dec 10, 2024

A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $Θ$ and its Application to Best-of-Both-Worlds

May 30, 2024

Learning with Posterior Sampling for Revenue Management under Time-varying Demand

May 08, 2024

Online $\mathrm{L}^{\natural}$-Convex Minimization

Apr 26, 2024