Stephen Fitz

The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning

Mar 06, 2024
Nathaniel Li, Alexander Pan, Anjali Gopal, Summer Yue, Daniel Berrios, Alice Gatti, Justin D. Li, Ann-Kathrin Dombrowski, Shashwat Goel, Long Phan, Gabriel Mukobi, Nathan Helm-Burger, Rassin Lababidi, Lennart Justen, Andrew B. Liu, Michael Chen, Isabelle Barrass, Oliver Zhang, Xiaoyuan Zhu, Rishub Tamirisa, Bhrugu Bharathi, Adam Khoja, Zhenqi Zhao, Ariel Herbert-Voss, Cort B. Breuer, Andy Zou, Mantas Mazeika, Zifan Wang, Palash Oswal, Weiran Liu, Adam A. Hunt, Justin Tienken-Harder, Kevin Y. Shih, Kemper Talley, John Guan, Russell Kaplan, Ian Steneker, David Campbell, Brad Jokubaitis, Alex Levinson, Jean Wang, William Qian, Kallol Krishna Karmakar, Steven Basart, Stephen Fitz, Mindy Levine, Ponnurangam Kumaraguru, Uday Tupakula, Vijay Varadharajan, Yan Shoshitaishvili, Jimmy Ba, Kevin M. Esvelt, Alexandr Wang, Dan Hendrycks

Via arXiv

Do Large GPT Models Discover Moral Dimensions in Language Representations? A Topological Study Of Sentence Embeddings

Sep 17, 2023
Stephen Fitz

Via arXiv

Personality Traits in Large Language Models

Jul 01, 2023
Mustafa Safdari, Greg Serapio-García, Clément Crepy, Stephen Fitz, Peter Romero, Luning Sun, Marwa Abdulhai, Aleksandra Faust, Maja Matarić

Via arXiv

Parameter-Efficient Neural Question Answering Models via Graph-Enriched Document Representations

Jun 01, 2021
Louis Castricato, Stephen Fitz, Won Young Shin

Via arXiv