Sheng Shen

Crosslingual Generalization through Multitask Finetuning

Nov 03, 2022
Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M Saiful Bari, Sheng Shen, Zheng-Xin Yong, Hailey Schoelkopf, Xiangru Tang, Dragomir Radev, Alham Fikri Aji, Khalid Almubarak, Samuel Albanie, Zaid Alyafeai, Albert Webson, Edward Raff, Colin Raffel

What Language Model to Train if You Have One Million GPU Hours?

Oct 27, 2022
Teven Le Scao, Thomas Wang, Daniel Hesslow, Lucile Saulnier, Stas Bekman, M Saiful Bari, Stella Biderman, Hady Elsahar, Niklas Muennighoff, Jason Phang, Ofir Press, Colin Raffel, Victor Sanh, Sheng Shen, Lintang Sutawika, Jaesung Tae, Zheng Xin Yong, Julien Launay, Iz Beltagy

ITSRN++: Stronger and Better Implicit Transformer Network for Continuous Screen Content Image Super-Resolution

Oct 17, 2022
Sheng Shen, Huanjing Yue, Jingyu Yang, Kun Li

K-LITE: Learning Transferable Visual Models with External Knowledge

Apr 20, 2022
Sheng Shen, Chunyuan Li, Xiaowei Hu, Yujia Xie, Jianwei Yang, Pengchuan Zhang, Anna Rohrbach, Zhe Gan, Lijuan Wang, Lu Yuan, Ce Liu, Kurt Keutzer, Trevor Darrell, Jianfeng Gao

One Parameter Defense -- Defending against Data Inference Attacks via Differential Privacy

Mar 13, 2022
Dayong Ye, Sheng Shen, Tianqing Zhu, Bo Liu, Wanlei Zhou

Staged Training for Transformer Language Models

Mar 11, 2022
Sheng Shen, Pete Walsh, Kurt Keutzer, Jesse Dodge, Matthew Peters, Iz Beltagy

Implicit Transformer Network for Screen Content Image Continuous Super-Resolution

Dec 12, 2021
Jingyu Yang, Sheng Shen, Huanjing Yue, Kun Li

Discovering Non-monotonic Autoregressive Orderings with Variational Inference

Oct 27, 2021
Xuanlin Li, Brandon Trabucco, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao

Multitask Prompted Training Enables Zero-Shot Task Generalization

Oct 15, 2021
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, Lintang Sutawika, Zaid Alyafeai, Antoine Chaffin, Arnaud Stiegler, Teven Le Scao, Arun Raja, Manan Dey, M Saiful Bari, Canwen Xu, Urmish Thakker, Shanya Sharma Sharma, Eliza Szczechla, Taewoon Kim, Gunjan Chhablani, Nihal Nayak, Debajyoti Datta, Jonathan Chang, Mike Tian-Jian Jiang, Han Wang, Matteo Manica, Sheng Shen, Zheng Xin Yong, Harshit Pandey, Rachel Bawden, Thomas Wang, Trishala Neeraj, Jos Rozen, Abheesht Sharma, Andrea Santilli, Thibault Fevry, Jason Alan Fries, Ryan Teehan, Stella Biderman, Leo Gao, Tali Bers, Thomas Wolf, Alexander M. Rush

What's Hidden in a One-layer Randomly Weighted Transformer?

Sep 08, 2021
Sheng Shen, Zhewei Yao, Douwe Kiela, Kurt Keutzer, Michael W. Mahoney