Hima Lakkaraju

A Study on the Calibration of In-context Learning

Dec 11, 2023
Hanlin Zhang, Yi-Fan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric Xing, Hima Lakkaraju, Sham Kakade

Via arXiv

Certifying LLM Safety against Adversarial Prompting

Sep 06, 2023
Aounon Kumar, Chirag Agarwal, Suraj Srinivas, Soheil Feizi, Hima Lakkaraju

Via arXiv

Fair Machine Unlearning: Data Removal while Mitigating Disparities

Jul 27, 2023
Alex Oesterling, Jiaqi Ma, Flavio P. Calmon, Hima Lakkaraju

Via arXiv

Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness

May 30, 2023
Suraj Srinivas, Sebastian Bordt, Hima Lakkaraju

Via arXiv

Feature Attributions and Counterfactual Explanations Can Be Manipulated

Jun 23, 2021
Dylan Slack, Sophie Hilgard, Sameer Singh, Hima Lakkaraju

Via arXiv