Hannah Cyberey

Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control

Apr 23, 2025
Via arXiv

Sensing and Steering Stereotypes: Extracting and Applying Gender Representation Vectors in LLMs

Feb 27, 2025
Via arXiv