Himabindu Lakkaraju

Manipulating Large Language Models to Increase Product Visibility

Apr 11, 2024
Aounon Kumar, Himabindu Lakkaraju

Data Poisoning Attacks on Off-Policy Policy Evaluation Methods

Apr 06, 2024
Elita Lobo, Harvineet Singh, Marek Petrik, Cynthia Rudin, Himabindu Lakkaraju

Towards Safe and Aligned Large Language Models for Medicine

Mar 06, 2024
Tessa Han, Aounon Kumar, Chirag Agarwal, Himabindu Lakkaraju

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

Feb 27, 2024
Zhenting Qi, Hanlin Zhang, Eric Xing, Sham Kakade, Himabindu Lakkaraju

Opening the Black Box of Large Language Models: Two Views on Holistic Interpretability

Feb 16, 2024
Haiyan Zhao, Fan Yang, Himabindu Lakkaraju, Mengnan Du

Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)

Feb 16, 2024
Usha Bhalla, Alex Oesterling, Suraj Srinivas, Flavio P. Calmon, Himabindu Lakkaraju

Understanding the Effects of Iterative Prompting on Truthfulness

Feb 09, 2024
Satyapriya Krishna, Chirag Agarwal, Himabindu Lakkaraju

Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models

Feb 08, 2024
Chirag Agarwal, Sree Harsha Tanneru, Himabindu Lakkaraju

Quantifying Uncertainty in Natural Language Explanations of Large Language Models

Nov 06, 2023
Sree Harsha Tanneru, Chirag Agarwal, Himabindu Lakkaraju

Investigating the Fairness of Large Language Models for Predictions on Tabular Data

Oct 23, 2023
Yanchen Liu, Srishti Gautam, Jiaqi Ma, Himabindu Lakkaraju
