Peter Hase

Are language models rational? The case of coherence norms and belief revision

Jun 05, 2024

LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models

May 31, 2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024

Rethinking Machine Unlearning for Large Language Models

Feb 15, 2024

The Unreasonable Effectiveness of Easy Training Data for Hard Tasks

Jan 12, 2024

Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks

Sep 29, 2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Jul 27, 2023

Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind

Jun 15, 2023

Adaptive Contextual Perception: How to Generalize to New Backgrounds and Ambiguous Objects

Jun 09, 2023

Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models

Jan 10, 2023