Teun van der Weij

Extending Activation Steering to Broad Skills and Multiple Behaviours

Mar 09, 2024
Teun van der Weij, Massimo Poesio, Nandi Schoots

Evaluating Shutdown Avoidance of Language Models in Textual Scenarios

Jul 03, 2023
Teun van der Weij, Simon Lermen, Leon Lang
