Yuanchun Wang

GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model

Jun 11, 2023
Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Yang Yang, Hongyin Tang, Keqing He, Jiahao Liu, Jingang Wang, Shu Zhao, Peng Zhang, Jie Tang

Are Intermediate Layers and Labels Really Necessary? A General Language Model Distillation Method

Jun 11, 2023
Shicheng Tan, Weng Lam Tam, Yuanchun Wang, Wenwen Gong, Shu Zhao, Peng Zhang, Jie Tang
