Usman Naseem

SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization

Apr 08, 2026
Via arXiv

Over-Refusal and Representation Subspaces: A Mechanistic Analysis of Task-Conditioned Refusal in Aligned LLMs

Mar 29, 2026
Via arXiv

Can Large Language Models Make Everyone Happy?

Feb 11, 2026
Via arXiv

From Native Memes to Global Moderation: Cross-Cultural Evaluation of Vision-Language Models for Hateful Meme Detection

Feb 11, 2026
Via arXiv

Are Aligned Large Language Models Still Misaligned?

Feb 11, 2026
Via arXiv

From Native Memes to Global Moderation: Cross-Cultural Evaluation of Vision-Language Models for Hateful Meme Detection

Feb 07, 2026
Via arXiv

When the Model Said 'No Comment', We Knew Helpfulness Was Dead, Honesty Was Alive, and Safety Was Terrified

Feb 07, 2026
Via arXiv

Do Large Language Models Reflect Demographic Pluralism in Safety?

Feb 07, 2026
Via arXiv

PersoDPO: Scalable Preference Optimization for Instruction-Adherent, Persona-Grounded Dialogue via Multi-LLM Evaluation

Feb 04, 2026
Via arXiv

PersoPilot: An Adaptive AI-Copilot for Transparent Contextualized Persona Classification and Personalized Response Generation

Feb 04, 2026
Via arXiv