Himabindu Lakkaraju

Manipulating Large Language Models to Increase Product Visibility

Apr 11, 2024
Via arXiv

Data Poisoning Attacks on Off-Policy Policy Evaluation Methods

Apr 06, 2024
Via arXiv

Towards Safe and Aligned Large Language Models for Medicine

Mar 06, 2024
Via arXiv

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

Feb 27, 2024
Via arXiv

Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)

Feb 16, 2024
Via arXiv

Opening the Black Box of Large Language Models: Two Views on Holistic Interpretability

Feb 16, 2024
Via arXiv

Understanding the Effects of Iterative Prompting on Truthfulness

Feb 09, 2024
Via arXiv

Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models

Feb 08, 2024
Via arXiv

Quantifying Uncertainty in Natural Language Explanations of Large Language Models

Nov 06, 2023
Via arXiv

Investigating the Fairness of Large Language Models for Predictions on Tabular Data

Oct 23, 2023
Via arXiv