Peter Hase

System-1.x: Learning to Balance Fast and Slow Planning with Language Models

Jul 19, 2024

Fundamental Problems With Model Editing: How Should Rational Belief Revision Work in LLMs?

Jun 27, 2024

Are language models rational? The case of coherence norms and belief revision

Jun 05, 2024

LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models

May 31, 2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024

Rethinking Machine Unlearning for Large Language Models

Feb 15, 2024

The Unreasonable Effectiveness of Easy Training Data for Hard Tasks

Jan 12, 2024

Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks

Sep 29, 2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Jul 27, 2023

Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind

Jun 15, 2023