Imad Aouali

Unified PAC-Bayesian Study of Pessimism for Offline Policy Learning with Regularized Importance Sampling

Jun 05, 2024

Logarithmic Smoothing for Pessimistic Off-Policy Evaluation, Selection and Learning

May 23, 2024

Bayesian Off-Policy Evaluation and Learning for Large Action Spaces

Feb 22, 2024

Diffusion Models Meet Contextual Bandits with Large Action Spaces

Feb 15, 2024

Prior-Dependent Allocations for Bayesian Fixed-Budget Best-Arm Identification in Structured Bandits

Feb 08, 2024

Exponential Smoothing for Off-Policy Learning

May 25, 2023

Offline Evaluation of Reward-Optimizing Recommender Systems: The Case of Simulation

Sep 18, 2022

A Scalable Probabilistic Model for Reward Optimizing Slate Recommendation

Aug 10, 2022

Generalizing Hierarchical Bayesian Bandits

May 30, 2022

Combining Reward and Rank Signals for Slate Recommendation

Jul 29, 2021