Sharan Maiya

Ho Wan

Will AI Tell Lies to Save Sick Children? Litmus-Testing AI Values Prioritization with AIRiskDilemmas

May 20, 2025
Via arXiv

Cluster-norm for Unsupervised Probing of Knowledge

Jul 26, 2024
Via arXiv