Mona Diab

Carnegie Mellon University

SAEs $\textit{Can}$ Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs

Apr 11, 2025
Via arXiv

CoRAG: Collaborative Retrieval-Augmented Generation

Apr 02, 2025
Via arXiv

Intrinsic Bias is Predicted by Pretraining Data and Correlates with Downstream Performance in Vision-Language Encoders

Feb 11, 2025
Via arXiv

Towards Global AI Inclusivity: A Large-Scale Multilingual Terminology Dataset

Dec 24, 2024
Via arXiv

Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models

Nov 01, 2024
Via arXiv

BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data

Oct 21, 2024
Via arXiv

The FIGNEWS Shared Task on News Media Narratives

Jul 25, 2024
Via arXiv

Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

Jun 25, 2024
Via arXiv

Evaluating Large Language Model Biases in Persona-Steered Generation

May 30, 2024
Via arXiv

Automatic Generation of Model and Data Cards: A Step Towards Responsible AI

May 10, 2024
Via arXiv