Silviu Pitis

Consistent Aggregation of Objectives with Diverse Time Preferences Requires Non-Markovian Rewards

Sep 30, 2023

Identifying the Risks of LM Agents with an LM-Emulated Sandbox

Sep 25, 2023

Boosted Prompt Ensembles for Large Language Models

Apr 12, 2023

Large Language Models Are Human-Level Prompt Engineers

Nov 03, 2022

MoCoDA: Model-based Counterfactual Data Augmentation

Oct 20, 2022

Counterfactual Data Augmentation using Locally Factored Dynamics

Jul 06, 2020

Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning

Jul 06, 2020

An Inductive Bias for Distances: Neural Nets that Respect the Triangle Inequality

Feb 14, 2020

Objective Social Choice: Using Auxiliary Information to Improve Voting Outcomes

Jan 27, 2020

Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning

Sep 09, 2019