Thomas Icard

A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments

Jan 23, 2024
Zhengxuan Wu, Atticus Geiger, Jing Huang, Aryaman Arora, Thomas Icard, Christopher Potts, Noah D. Goodman

Comparing Causal Frameworks: Potential Outcomes, Structural Models, Graphs, and Abstractions

Jun 25, 2023
Duligur Ibeling, Thomas Icard

Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations

Mar 05, 2023
Atticus Geiger, Zhengxuan Wu, Christopher Potts, Thomas Icard, Noah D. Goodman

Causal Abstraction for Faithful Model Interpretation

Jan 11, 2023
Atticus Geiger, Chris Potts, Thomas Icard

Causal Abstraction with Soft Interventions

Nov 22, 2022
Riccardo Massidda, Atticus Geiger, Thomas Icard, Davide Bacciu

Holistic Evaluation of Language Models

Nov 16, 2022
Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda

Causal Distillation for Language Models

Dec 05, 2021
Zhengxuan Wu, Atticus Geiger, Josh Rozner, Elisa Kreiss, Hanson Lu, Thomas Icard, Christopher Potts, Noah D. Goodman

Inducing Causal Structure for Interpretable Neural Networks

Dec 01, 2021
Atticus Geiger, Zhengxuan Wu, Hanson Lu, Josh Rozner, Elisa Kreiss, Thomas Icard, Noah D. Goodman, Christopher Potts

On the Opportunities and Risks of Foundation Models

Aug 18, 2021
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri Chatterji, Annie Chen, Kathleen Creel, Jared Quincy Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Kohd, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
