Title: DGD: Densifying the Knowledge of Neural Networks with Filter Grafting and Knowledge Distillation

Apr 26, 2020

Hao Cheng, Fanxu Meng, Ke Li, Huixiang Luo, Guangming Lu, Xiaowei Guo, Feiyue Huang, Xing Sun

Abstract: With a fixed model structure, knowledge distillation and filter grafting are two effective ways to boost single-model accuracy. However, the working mechanisms of distillation and grafting, and the differences between them, have not been fully explained. In this paper, we evaluate the effect of distillation and grafting at the filter level and find that the impacts of the two techniques are surprisingly complementary: distillation mostly enhances the knowledge of valid filters, while grafting mostly reactivates invalid filters. This observation guides us to design a unified training framework called DGD, in which distillation and grafting are naturally combined to increase the knowledge density inside the filters of a fixed model structure. Through extensive experiments, we show that the knowledge-densified network in DGD shares the advantages of both distillation and grafting, lifting the model accuracy to a higher level.
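
The abstract does not give the loss or the grafting rule, so the following is only an illustrative sketch of the two ingredients it names, not the DGD method itself. It assumes a common temperature-softened KL distillation loss and a simplified grafting step that blends low-L1-norm ("invalid") filters with the matching filters of a second network. The function names, the `threshold`, the blend coefficient `alpha`, and the use of the L1 norm as the validity criterion are all assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on temperature-softened outputs, scaled by T^2.

    Assumed form of the distillation term; the abstract does not specify it.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def l1_norm(filt):
    """L1 norm of a filter given as a flat list of weights."""
    return sum(abs(w) for w in filt)


def graft_filters(own, donor, threshold=0.1, alpha=0.5):
    """Blend filters whose L1 norm is below `threshold` with the donor's.

    Hypothetical simplification: filters at or above the threshold are kept
    unchanged, and low-norm filters become alpha * own + (1 - alpha) * donor.
    """
    grafted = []
    for f_own, f_donor in zip(own, donor):
        if l1_norm(f_own) < threshold:
            grafted.append(
                [alpha * a + (1 - alpha) * b for a, b in zip(f_own, f_donor)]
            )
        else:
            grafted.append(list(f_own))
    return grafted
```

In this sketch the two steps act on different filters, which mirrors the complementarity the abstract describes: the distillation term shapes the outputs driven by filters that are already informative, while the grafting step only touches filters that carry almost no weight.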
