Oliver Rausch

Towards Understanding Sycophancy in Language Models

Oct 27, 2023
Via arXiv

Specific versus General Principles for Constitutional AI

Oct 20, 2023
Via arXiv

Question Decomposition Improves the Faithfulness of Model-Generated Reasoning

Jul 25, 2023
Via arXiv

Measuring Faithfulness in Chain-of-Thought Reasoning

Jul 17, 2023
Via arXiv

The Capacity for Moral Self-Correction in Large Language Models

Feb 18, 2023
Via arXiv

Discovering Language Model Behaviors with Model-Written Evaluations

Dec 19, 2022
Via arXiv

A Data-Centric Optimization Framework for Machine Learning

Oct 20, 2021
Via arXiv

Towards Automated Anamnesis Summarization: BERT-based Models for Symptom Extraction

Nov 03, 2020
Via arXiv