Oscar Obeso

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Nov 21, 2024
Via arXiv

Refusal in Language Models Is Mediated by a Single Direction

Jun 17, 2024
Via arXiv