Masatoshi Uehara

Functional Graphical Models: Structure Enables Offline Data-Driven Optimization

Jan 12, 2024
Jakub Grudzien Kuba, Masatoshi Uehara, Pieter Abbeel, Sergey Levine

Source Condition Double Robust Inference on Functionals of Inverse Problems

Jul 25, 2023
Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara

Off-Policy Evaluation of Ranking Policies under Diverse User Behavior

Jun 26, 2023
Haruka Kiyohara, Masatoshi Uehara, Yusuke Narita, Nobuyuki Shimizu, Yasuo Yamamoto, Yuta Saito

How to Query Human Feedback Efficiently in RL?

May 29, 2023
Wenhao Zhan, Masatoshi Uehara, Wen Sun, Jason D. Lee

Provable Offline Reinforcement Learning with Human Feedback

May 24, 2023
Wenhao Zhan, Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun

Distributional Offline Policy Evaluation with Predictive Error Guarantees

Feb 19, 2023
Runzhe Wu, Masatoshi Uehara, Wen Sun

Minimax Instrumental Variable Regression and $L_2$ Convergence Guarantees without Identification or Closedness

Feb 10, 2023
Andrew Bennett, Nathan Kallus, Xiaojie Mao, Whitney Newey, Vasilis Syrgkanis, Masatoshi Uehara

Refined Value-Based Offline RL under Realizability and Partial Coverage

Feb 05, 2023
Masatoshi Uehara, Nathan Kallus, Jason D. Lee, Wen Sun

A Review of Off-Policy Evaluation in Reinforcement Learning

Dec 13, 2022
Masatoshi Uehara, Chengchun Shi, Nathan Kallus

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs

Jul 26, 2022
Masatoshi Uehara, Haruka Kiyohara, Andrew Bennett, Victor Chernozhukov, Nan Jiang, Nathan Kallus, Chengchun Shi, Wen Sun
