Himabindu Lakkaraju

Towards Safe and Aligned Large Language Models for Medicine

Mar 06, 2024
Tessa Han, Aounon Kumar, Chirag Agarwal, Himabindu Lakkaraju

Via arXiv

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

Feb 27, 2024
Zhenting Qi, Hanlin Zhang, Eric Xing, Sham Kakade, Himabindu Lakkaraju

Via arXiv

Opening the Black Box of Large Language Models: Two Views on Holistic Interpretability

Feb 16, 2024
Haiyan Zhao, Fan Yang, Himabindu Lakkaraju, Mengnan Du

Via arXiv

Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)

Feb 16, 2024
Usha Bhalla, Alex Oesterling, Suraj Srinivas, Flavio P. Calmon, Himabindu Lakkaraju

Via arXiv

Understanding the Effects of Iterative Prompting on Truthfulness

Feb 09, 2024
Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju

Via arXiv

Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models

Feb 08, 2024
Chirag Agarwal, Sree Harsha Tanneru, Himabindu Lakkaraju

Via arXiv

Quantifying Uncertainty in Natural Language Explanations of Large Language Models

Nov 06, 2023
Sree Harsha Tanneru, Chirag Agarwal, Himabindu Lakkaraju

Via arXiv

Investigating the Fairness of Large Language Models for Predictions on Tabular Data

Oct 23, 2023
Yanchen Liu, Srishti Gautam, Jiaqi Ma, Himabindu Lakkaraju

Via arXiv

In-Context Unlearning: Language Models as Few Shot Unlearners

Oct 12, 2023
Martin Pawelczyk, Seth Neel, Himabindu Lakkaraju

Via arXiv