Agam Goyal

CausalDetox: Causal Head Selection and Intervention for Language Model Detoxification

Apr 16, 2026

Masking or Mitigating? Deconstructing the Impact of Query Rewriting on Retriever Biases in RAG

Apr 07, 2026

From Plausible to Causal: Counterfactual Semantics for Policy Evaluation in Simulated Online Communities

Apr 05, 2026

Social Simulacra in the Wild: AI Agent Communities on Moltbook

Mar 17, 2026

Answer Bubbles: Information Exposure in AI-Mediated Search

Mar 17, 2026

The Hidden Toll of Social Media News: Causal Effects on Psychosocial Wellbeing

Jan 20, 2026

ArgCMV: An Argument Summarization Benchmark for the LLM-era

Aug 27, 2025

MoMoE: Mixture of Moderation Experts Framework for AI-Assisted Online Governance

May 20, 2025

Breaking Bad Tokens: Detoxification of LLMs Using Sparse Autoencoders

May 20, 2025

SLM-Mod: Small Language Models Surpass LLMs at Content Moderation

Oct 17, 2024