CBEAF-Adapting: Enhanced Continual Pretraining for Building Chinese Biomedical Language Model

Nov 21, 2022
Yongyu Yan, Kui Xue, Qi Ye, Tong Ruan


Continual pretraining is a standard way of building a domain-specific pretrained language model from a general-domain one. However, sequential task training can cause catastrophic forgetting, which degrades performance on downstream tasks. In this paper, we propose a continual pretraining method for BERT-based models, named CBEAF-Adapting (Chinese Biomedical Enhanced Attention-FFN Adapting). Its main idea is to introduce a small number of additional attention heads and hidden units inside each self-attention layer and feed-forward network. Using the Chinese biomedical domain as a running example, we train a domain-specific language model named CBEAF-RoBERTa and evaluate it on downstream tasks. The results show that, with only about 3% of model parameters trained, our method achieves average performance gains of about 0.5% and 2% over the best-performing baseline and the domain-specific model PCL-MedBERT, respectively. We also examine the forgetting behavior of different pretraining methods; ours alleviates forgetting by about 13% compared to fine-tuning.
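The abstract only outlines the idea of widening each feed-forward network with a few extra trainable hidden units while the base weights stay frozen; the exact CBEAF-Adapting design is not given here. As a rough illustration, the following numpy sketch widens a BERT-sized FFN (hidden size 768, intermediate size 3072) by 96 hypothetical extra units, zero-initializes the new output weights so the frozen base behavior is preserved at the start of training, and checks that the added weights are roughly 3% of the FFN's parameters. All sizes and names are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

rng = np.random.default_rng(0)

# BERT-base-like FFN sizes; d_extra is the small added capacity (illustrative).
d_model, d_ff, d_extra = 768, 3072, 96

# Frozen base FFN weights, standing in for the pretrained general-domain model.
W1 = rng.normal(scale=0.02, size=(d_model, d_ff))
W2 = rng.normal(scale=0.02, size=(d_ff, d_model))

# New trainable weights for the extra hidden units (the "adapting" part).
# W2_new starts at zero so the widened FFN initially matches the base FFN.
W1_new = rng.normal(scale=0.02, size=(d_model, d_extra))
W2_new = np.zeros((d_extra, d_model))

def ffn_extended(x):
    """FFN whose hidden layer is widened by d_extra units; only *_new would train."""
    h = np.maximum(np.concatenate([x @ W1, x @ W1_new], axis=-1), 0.0)  # ReLU
    return h[..., :d_ff] @ W2 + h[..., d_ff:] @ W2_new

x = rng.normal(size=(4, d_model))
y = ffn_extended(x)

# Fraction of FFN parameters that are newly introduced (and hence trainable).
new_params = W1_new.size + W2_new.size
total_params = W1.size + W2.size + new_params
print(y.shape, round(100 * new_params / total_params, 1))
```

With zero-initialized `W2_new`, the widened network reproduces the frozen FFN exactly before any domain-specific training, so continual pretraining starts from the general-domain model's behavior and only the small added parameter set moves, which is one plausible way to limit catastrophic forgetting.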
