
Himabindu Lakkaraju

In-Context Unlearning: Language Models as Few Shot Unlearners

Oct 12, 2023

Are Large Language Models Post Hoc Explainers?

Oct 10, 2023

On the Trade-offs between Adversarial Robustness and Actionable Explanations

Sep 28, 2023

Accurate, Explainable, and Private Models: Providing Recourse While Minimizing Training Data Leakage

Aug 08, 2023

Verifiable Feature Attributions: A Bridge between Post Hoc Explainability and Inherent Interpretability

Jul 27, 2023

Efficient Estimation of the Local Robustness of Machine Learning Models

Jul 26, 2023

Analyzing Chain-of-Thought Prompting in Large Language Models via Gradient-based Feature Attributions

Jul 25, 2023

Consistent Explanations in the Face of Model Indeterminacy via Ensembling

Jun 13, 2023

On Minimizing the Impact of Dataset Shifts on Actionable Explanations

Jun 11, 2023

Word-Level Explanations for Analyzing Bias in Text-to-Image Models

Jun 03, 2023