Li Dong

Knowledge Distillation of Large Language Models
Jun 14, 2023
Yuxian Gu, Li Dong, Furu Wei, Minlie Huang

Augmenting Language Models with Long-Term Memory
Jun 12, 2023
Weizhi Wang, Li Dong, Hao Cheng, Xiaodong Liu, Xifeng Yan, Jianfeng Gao, Furu Wei

Pre-Training to Learn in Context
May 16, 2023
Yuxian Gu, Li Dong, Furu Wei, Minlie Huang

Language Is Not All You Need: Aligning Perception with Language Models
Mar 01, 2023
Shaohan Huang, Li Dong, Wenhui Wang, Yaru Hao, Saksham Singhal, Shuming Ma, Tengchao Lv, Lei Cui, Owais Khan Mohammed, Barun Patra, Qiang Liu, Kriti Aggarwal, Zewen Chi, Johan Bjorck, Vishrav Chaudhary, Subhojit Som, Xia Song, Furu Wei

Generic-to-Specific Distillation of Masked Autoencoders
Feb 28, 2023
Wei Huang, Zhiliang Peng, Li Dong, Furu Wei, Jianbin Jiao, Qixiang Ye

Semi-Supervised Learning with Pseudo-Negative Labels for Image Classification
Jan 10, 2023
Hao Xu, Hui Xiao, Huazheng Hao, Li Dong, Xiaojie Qiu, Chengbin Peng

Language Models as Inductive Reasoners
Dec 21, 2022
Zonglin Yang, Li Dong, Xinya Du, Hao Cheng, Erik Cambria, Xiaodong Liu, Jianfeng Gao, Furu Wei

Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
Dec 21, 2022
Damai Dai, Yutao Sun, Li Dong, Yaru Hao, Zhifang Sui, Furu Wei

A Length-Extrapolatable Transformer
Dec 20, 2022
Yutao Sun, Li Dong, Barun Patra, Shuming Ma, Shaohan Huang, Alon Benhaim, Vishrav Chaudhary, Xia Song, Furu Wei

GanLM: Encoder-Decoder Pre-training with an Auxiliary Discriminator
Dec 20, 2022
Jian Yang, Shuming Ma, Li Dong, Shaohan Huang, Haoyang Huang, Yuwei Yin, Dongdong Zhang, Liqun Yang, Zhoujun Li, Furu Wei
