Soichiro Nishimori

Leveraging Domain-Unlabeled Data in Offline Reinforcement Learning across Two Domains

Apr 11, 2024
Soichiro Nishimori, Xin-Qiang Cai, Johannes Ackermann, Masashi Sugiyama

Via arXiv

A Policy Gradient Primal-Dual Algorithm for Constrained MDPs with Uniform PAC Guarantees

Feb 02, 2024
Toshinori Kitamura, Tadashi Kozuno, Masahiro Kato, Yuki Ichihara, Soichiro Nishimori, Akiyoshi Sannai, Sho Sonoda, Wataru Kumagai, Yutaka Matsuo

Via arXiv

End-to-End Policy Gradient Method for POMDPs and Explainable Agents

Apr 19, 2023
Soichiro Nishimori, Sotetsu Koyamada, Shin Ishii

Via arXiv

Pgx: Hardware-accelerated parallel game simulation for reinforcement learning

Mar 29, 2023
Sotetsu Koyamada, Shinri Okano, Soichiro Nishimori, Yu Murata, Keigo Habara, Haruka Kita, Shin Ishii

Via arXiv