Miles Turpin

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024
Via arXiv

Bias-Augmented Consistency Training Reduces Biased Reasoning in Chain-of-Thought

Mar 08, 2024
Via arXiv

Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting

May 07, 2023
Via arXiv