Besmira Nushi

Introducing v0.5 of the AI Safety Benchmark from MLCommons

Apr 18, 2024
Bertie Vidgen, Adarsh Agrawal, Ahmed M. Ahmed, Victor Akinwande, Namir Al-Nuaimi, Najla Alfaraj, Elie Alhajjar, Lora Aroyo, Trupti Bavalatti, Borhane Blili-Hamelin, Kurt Bollacker, Rishi Bommasani, Marisa Ferrara Boston, Siméon Campos, Kal Chakra, Canyu Chen, Cody Coleman, Zacharie Delpierre Coudert, Leon Derczynski, Debojyoti Dutta, Ian Eisenberg, James Ezick, Heather Frase, Brian Fuller, Ram Gandikota, Agasthya Gangavarapu, Ananya Gangavarapu, James Gealy, Rajat Ghosh, James Goel, Usman Gohar, Sujata Goswami, Scott A. Hale, Wiebke Hutiri, Joseph Marvin Imperial, Surgan Jandial, Nick Judd, Felix Juefei-Xu, Foutse Khomh, Bhavya Kailkhura, Hannah Rose Kirk, Kevin Klyman, Chris Knotz, Michael Kuchnik, Shachi H. Kumar, Chris Lengerich, Bo Li, Zeyi Liao, Eileen Peters Long, Victor Lu, Yifan Mai, Priyanka Mary Mammen, Kelvin Manyeki, Sean McGregor, Virendra Mehta, Shafee Mohammed, Emanuel Moss, Lama Nachman, Dinesh Jinenhally Naganna, Amin Nikanjam, Besmira Nushi, Luis Oala, Iftach Orr, Alicia Parrish, Cigdem Patlak, William Pietri, Forough Poursabzi-Sangdeh, Eleonora Presani, Fabrizio Puletti, Paul Röttger, Saurav Sahay, Tim Santos, Nino Scherrer, Alice Schoenauer Sebag, Patrick Schramowski, Abolfazl Shahbazi, Vin Sharma, Xudong Shen, Vamsi Sistla, Leonard Tang, Davide Testuggine, Vithursan Thangarasa, Elizabeth Anne Watkins, Rebecca Weiss, Chris Welty, Tyler Wilbers, Adina Williams, Carole-Jean Wu, Poonam Yadav, Xianjun Yang, Yi Zeng, Wenhui Zhang, Fedor Zhdanov, Jiacheng Zhu, Percy Liang, Peter Mattson, Joaquin Vanschoren

Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models

Apr 09, 2024
Sebastian Bordt, Harsha Nori, Vanessa Rodrigues, Besmira Nushi, Rich Caruana

KITAB: Evaluating LLMs on Constraint Satisfaction for Information Retrieval

Oct 24, 2023
Marah I Abdin, Suriya Gunasekar, Varun Chandrasekaran, Jerry Li, Mert Yuksekgonul, Rahee Ghosh Peshawaria, Ranjita Naik, Besmira Nushi

Diversity of Thought Improves Reasoning Abilities of Large Language Models

Oct 11, 2023
Ranjita Naik, Varun Chandrasekaran, Mert Yuksekgonul, Hamid Palangi, Besmira Nushi

Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models

Sep 26, 2023
Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi

Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning

Apr 08, 2023
Yu Yang, Besmira Nushi, Hamid Palangi, Baharan Mirzasoleiman

Social Biases through the Text-to-Image Generation Lens

Mar 30, 2023
Ranjita Naik, Besmira Nushi

Benchmarking Spatial Relationships in Text-to-Image Generation

Dec 20, 2022
Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang

Advancing Human-AI Complementarity: The Impact of User Expertise and Algorithmic Tuning on Joint Decision Making

Aug 16, 2022
Kori Inkpen, Shreya Chappidi, Keri Mallari, Besmira Nushi, Divya Ramesh, Pietro Michelucci, Vani Mandava, Libuše Hannah Vepřek, Gabrielle Quinn

Who Goes First? Influences of Human-AI Workflow on Decision Making in Clinical Imaging

May 19, 2022
Riccardo Fogliato, Shreya Chappidi, Matthew Lungren, Michael Fitzke, Mark Parkinson, Diane Wilson, Paul Fisher, Eric Horvitz, Kori Inkpen, Besmira Nushi
