Susan Zhang

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning

Sep 05, 2023
Lili Yu, Bowen Shi, Ramakanth Pasunuru, Benjamin Muller, Olga Golovneva, Tianlu Wang, Arun Babu, Binh Tang, Brian Karrer, Shelly Sheynin, Candace Ross, Adam Polyak, Russell Howes, Vasu Sharma, Puxin Xu, Hovhannes Tamoyan, Oron Ashual, Uriel Singer, Shang-Wen Li, Susan Zhang, Richard James, Gargi Ghosh, Yaniv Taigman, Maryam Fazel-Zarandi, Asli Celikyilmaz, Luke Zettlemoyer, Armen Aghajanyan

Via arXiv

LIMA: Less Is More for Alignment

May 18, 2023
Chunting Zhou, Pengfei Liu, Puxin Xu, Srini Iyer, Jiao Sun, Yuning Mao, Xuezhe Ma, Avia Efrat, Ping Yu, Lili Yu, Susan Zhang, Gargi Ghosh, Mike Lewis, Luke Zettlemoyer, Omer Levy

Via arXiv

A Theory on Adam Instability in Large-Scale Machine Learning

Apr 25, 2023
Igor Molybog, Peter Albert, Moya Chen, Zachary DeVito, David Esiobu, Naman Goyal, Punit Singh Koura, Sharan Narang, Andrew Poulton, Ruan Silva, Binh Tang, Diana Liskovich, Puxin Xu, Yuchen Zhang, Melanie Kambadur, Stephen Roller, Susan Zhang

Via arXiv

Effective Theory of Transformers at Initialization

Apr 04, 2023
Emily Dinan, Sho Yaida, Susan Zhang

Via arXiv

Scaling Laws for Generative Mixed-Modal Language Models

Jan 10, 2023
Armen Aghajanyan, Lili Yu, Alexis Conneau, Wei-Ning Hsu, Karen Hambardzumyan, Susan Zhang, Stephen Roller, Naman Goyal, Omer Levy, Luke Zettlemoyer

Via arXiv

OPT: Open Pre-trained Transformer Language Models

May 05, 2022
Susan Zhang, Stephen Roller, Naman Goyal, Mikel Artetxe, Moya Chen, Shuohui Chen, Christopher Dewan, Mona Diab, Xian Li, Xi Victoria Lin, Todor Mihaylov, Myle Ott, Sam Shleifer, Kurt Shuster, Daniel Simig, Punit Singh Koura, Anjali Sridhar, Tianlu Wang, Luke Zettlemoyer

Via arXiv

Long-Term Planning and Situational Awareness in OpenAI Five

Dec 13, 2019
Jonathan Raiman, Susan Zhang, Filip Wolski

Via arXiv

Neural Network Surgery with Sets

Dec 13, 2019
Jonathan Raiman, Susan Zhang, Christy Dennison

Via arXiv

Dota 2 with Large Scale Deep Reinforcement Learning

Dec 13, 2019
OpenAI: Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemysław Dębiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Chris Hesse, Rafal Józefowicz, Scott Gray, Catherine Olsson, Jakub Pachocki, Michael Petrov, Henrique Pondé de Oliveira Pinto, Jonathan Raiman, Tim Salimans, Jeremy Schlatter, Jonas Schneider, Szymon Sidor, Ilya Sutskever, Jie Tang, Filip Wolski, Susan Zhang

Via arXiv