Akifumi Wachi

Stepwise Alignment for Constrained Language Model Policy Optimization

Apr 17, 2024
Via arXiv

A Survey of Constraint Formulations in Safe Reinforcement Learning

Feb 03, 2024
Via arXiv

Long-term Safe Reinforcement Learning with Binary Feedback

Jan 11, 2024
Via arXiv

Verbosity Bias in Preference Labeling by Large Language Models

Oct 16, 2023
Via arXiv

Safe Exploration in Reinforcement Learning: A Generalized Formulation and Algorithms

Oct 05, 2023
Via arXiv

Safe Policy Optimization with Local Generalized Linear Function Approximations

Nov 09, 2021
Via arXiv

LOA: Logical Optimal Actions for Text-based Interaction Games

Oct 21, 2021
Via arXiv

Neuro-Symbolic Reinforcement Learning with First-Order Logic

Oct 21, 2021
Via arXiv

Reinforcement Learning with External Knowledge by using Logical Neural Networks

Mar 03, 2021
Via arXiv

Q-learning with Language Model for Edit-based Unsupervised Summarization

Oct 09, 2020
Via arXiv