Mrinank Sharma

Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training

Jan 17, 2024

Towards Understanding Sycophancy in Language Models

Oct 27, 2023

Understanding and Controlling a Maze-Solving Policy Network

Oct 12, 2023

Incorporating Unlabelled Data into Bayesian Neural Networks

Apr 04, 2023

Do Bayesian Neural Networks Need To Be Fully Stochastic?

Nov 11, 2022

Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt

Jun 16, 2022

Prioritized training on points that are learnable, worth learning, and not yet learned

Jul 06, 2021

On the robustness of effectiveness estimation of nonpharmaceutical interventions against COVID-19 transmission

Jul 27, 2020

Differentially Private Federated Variational Inference

Nov 24, 2019