Matt Fredrikson

Transfer Attacks and Defenses for Large Language Models on Coding Tasks

Nov 22, 2023
Chi Zhang, Zifan Wang, Ravi Mangal, Matt Fredrikson, Limin Jia, Corina Pasareanu

Is Certifying $\ell_p$ Robustness Still Worthwhile?

Oct 13, 2023
Ravi Mangal, Klas Leino, Zifan Wang, Kai Hu, Weicheng Yu, Corina Pasareanu, Anupam Datta, Matt Fredrikson

Representation Engineering: A Top-Down Approach to AI Transparency

Oct 10, 2023
Andy Zou, Long Phan, Sarah Chen, James Campbell, Phillip Guo, Richard Ren, Alexander Pan, Xuwang Yin, Mantas Mazeika, Ann-Kathrin Dombrowski, Shashwat Goel, Nathaniel Li, Michael J. Byun, Zifan Wang, Alex Mallen, Steven Basart, Sanmi Koyejo, Dawn Song, Matt Fredrikson, J. Zico Kolter, Dan Hendrycks

A Recipe for Improved Certifiable Robustness: Capacity and Data

Oct 04, 2023
Kai Hu, Klas Leino, Zifan Wang, Matt Fredrikson

Universal and Transferable Adversarial Attacks on Aligned Language Models

Jul 27, 2023
Andy Zou, Zifan Wang, J. Zico Kolter, Matt Fredrikson

Scaling in Depth: Unlocking Robustness Certification on ImageNet

Jan 29, 2023
Kai Hu, Andy Zou, Zifan Wang, Klas Leino, Matt Fredrikson

Learning Modulo Theories

Jan 26, 2023
Matt Fredrikson, Kaiji Lu, Saranya Vijayakumar, Somesh Jha, Vijay Ganesh, Zifan Wang

Black-Box Audits for Group Distribution Shifts

Sep 08, 2022
Marc Juarez, Samuel Yeom, Matt Fredrikson

On the Perils of Cascading Robust Classifiers

Jun 01, 2022
Ravi Mangal, Zifan Wang, Chi Zhang, Klas Leino, Corina Pasareanu, Matt Fredrikson

Faithful Explanations for Deep Graph Models

May 24, 2022
Zifan Wang, Yuhang Yao, Chaoran Zhang, Han Zhang, Youjie Kang, Carlee Joe-Wong, Matt Fredrikson, Anupam Datta
