Atoosa Kasirzadeh

Foundational Challenges in Assuring Alignment and Safety of Large Language Models

Apr 15, 2024
Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Yoshua Bengio, Danqi Chen, Samuel Albanie, Tegan Maharaj, Jakob Foerster, Florian Tramer, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger

A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)

Mar 31, 2024
Yashar Deldjoo, Zhankui He, Julian McAuley, Anton Korikov, Scott Sanner, Arnau Ramisa, René Vidal, Maheswaran Sathiamoorthy, Atoosa Kasirzadeh, Silvia Milano

Discipline and Label: A WEIRD Genealogy and Social Theory of Data Annotation

Feb 09, 2024
Andrew Smart, Ding Wang, Ellis Monk, Mark Díaz, Atoosa Kasirzadeh, Erin Van Liemt, Sonja Schmer-Galunder

Two Types of AI Existential Risk: Decisive and Accumulative

Jan 15, 2024
Atoosa Kasirzadeh

ChatGPT, Large Language Technologies, and the Bumpy Road of Benefiting Humanity

Apr 21, 2023
Atoosa Kasirzadeh

In conversation with Artificial Intelligence: aligning language models with human values

Sep 01, 2022
Atoosa Kasirzadeh, Iason Gabriel

Ethical and social risks of harm from Language Models

Dec 08, 2021
Laura Weidinger, John Mellor, Maribeth Rauh, Conor Griffin, Jonathan Uesato, Po-Sen Huang, Myra Cheng, Mia Glaese, Borja Balle, Atoosa Kasirzadeh, Zac Kenton, Sasha Brown, Will Hawkins, Tom Stepleton, Courtney Biles, Abeba Birhane, Julia Haas, Laura Rimell, Lisa Anne Hendricks, William Isaac, Sean Legassick, Geoffrey Irving, Iason Gabriel

User Tampering in Reinforcement Learning Recommender Systems

Sep 09, 2021
Charles Evans, Atoosa Kasirzadeh

Reasons, Values, Stakeholders: A Philosophical Framework for Explainable Artificial Intelligence

Mar 01, 2021
Atoosa Kasirzadeh

Mathematical decisions and non-causal elements of explainable AI

Dec 12, 2019
Atoosa Kasirzadeh
