Yutao Sun

Retentive Network: A Successor to Transformer for Large Language Models

Aug 09, 2023
Yutao Sun, Li Dong, Shaohan Huang, Shuming Ma, Yuqing Xia, Jilong Xue, Jianyong Wang, Furu Wei

Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers

Dec 21, 2022
Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei

A Length-Extrapolatable Transformer

Dec 20, 2022
Yutao Sun, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Xia Song, Furu Wei

Structured Prompting: Scaling In-Context Learning to 1,000 Examples

Dec 13, 2022
Yaru Hao, Yutao Sun, Li Dong, Zhixiong Han, Yuxian Gu, Furu Wei
