Dhawal Gupta

ICU-Sepsis: A Benchmark MDP Built from Real Medical Data

Jun 09, 2024
Via arXiv

From Past to Future: Rethinking Eligibility Traces

Dec 20, 2023
Via arXiv

Behavior Alignment via Reward Function Optimization

Oct 31, 2023
Via arXiv

Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF

Sep 16, 2023
Via arXiv

Coagent Networks: Generalized and Scaled

May 16, 2023
Via arXiv

Offline Reinforcement Learning for Mixture-of-Expert Dialogue Management

Feb 21, 2023
Via arXiv

Gradient Temporal-Difference Learning with Regularized Corrections

Jul 07, 2020
Via arXiv