
Urja Pawar

Moral Preferences of LLMs Under Directed Contextual Influence

Feb 26, 2026
Via arXiv

Detecting High-Stakes Interactions with Activation Probes

Jun 12, 2025
Via arXiv