Jasmine Xinze Li

Scaling laws for activation steering with Llama 2 models and refusal mechanisms

Jul 15, 2025
Via arXiv

ProgressGym: Alignment with a Millennium of Moral Progress

Jun 28, 2024
Via arXiv