Gautam Siddharth Kashyap

Are Aligned Large Language Models Still Misaligned?

Feb 11, 2026
Via arXiv

Can Large Language Models Make Everyone Happy?

Feb 11, 2026
Via arXiv

When the Model Said 'No Comment', We Knew Helpfulness Was Dead, Honesty Was Alive, and Safety Was Terrified

Feb 07, 2026
Via arXiv

Do Large Language Models Reflect Demographic Pluralism in Safety?

Feb 07, 2026
Via arXiv

They Said Memes Were Harmless-We Found the Ones That Hurt: Decoding Jokes, Symbols, and Cultural References

Feb 03, 2026
Via arXiv

Revealing the Truth with ConLLM for Detecting Multi-Modal Deepfakes

Jan 24, 2026
Via arXiv

Do Clinical Question Answering Systems Really Need Specialised Medical Fine Tuning?

Jan 19, 2026
Via arXiv

We Think, Therefore We Align LLMs to Helpful, Harmless and Honest Before They Go Wrong

Sep 26, 2025
Via arXiv

MAGIC-Enhanced Keyword Prompting for Zero-Shot Audio Captioning with CLIP Models

Sep 16, 2025
Via arXiv

Too Helpful, Too Harmless, Too Honest or Just Right?

Sep 10, 2025
Via arXiv