Shibani Santurkar

Whose Opinions Do Language Models Reflect?

Mar 30, 2023
Shibani Santurkar, Esin Durmus, Faisal Ladhak, Cinoo Lee, Percy Liang, Tatsunori Hashimoto

Via arXiv

Data Selection for Language Models via Importance Resampling

Feb 06, 2023
Sang Michael Xie, Shibani Santurkar, Tengyu Ma, Percy Liang

Via arXiv

Holistic Evaluation of Language Models

Nov 16, 2022
Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda

Via arXiv

Is a Caption Worth a Thousand Images? A Controlled Study for Representation Learning

Jul 15, 2022
Shibani Santurkar, Yann Dubois, Rohan Taori, Percy Liang, Tatsunori Hashimoto

Via arXiv

Editing a classifier by rewriting its prediction rules

Dec 02, 2021
Shibani Santurkar, Dimitris Tsipras, Mahalaxmi Elango, David Bau, Antonio Torralba, Aleksander Madry

Via arXiv

3DB: A Framework for Debugging Computer Vision Models

Jun 07, 2021
Guillaume Leclerc, Hadi Salman, Andrew Ilyas, Sai Vemprala, Logan Engstrom, Vibhav Vineet, Kai Xiao, Pengchuan Zhang, Shibani Santurkar, Greg Yang, Ashish Kapoor, Aleksander Madry

Via arXiv

Leveraging Sparse Linear Layers for Debuggable Deep Networks

May 11, 2021
Eric Wong, Shibani Santurkar, Aleksander Mądry

Via arXiv

BREEDS: Benchmarks for Subpopulation Shift

Aug 11, 2020
Shibani Santurkar, Dimitris Tsipras, Aleksander Madry

Via arXiv

Implementation Matters in Deep Policy Gradients: A Case Study on PPO and TRPO

May 25, 2020
Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

Via arXiv

From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

May 22, 2020
Dimitris Tsipras, Shibani Santurkar, Logan Engstrom, Andrew Ilyas, Aleksander Madry

Via arXiv