Kui Xue

MidMed: Towards Mixed-Type Dialogues for Medical Consultation

Jun 14, 2023
Xiaoming Shi, Zeming Liu, Chuan Wang, Haitao Leng, Kui Xue, Xiaofan Zhang, Shaoting Zhang

Most medical dialogue systems assume that patients have clear goals (medicine querying, surgical operation querying, etc.) before a medical consultation. However, in many real scenarios, patients often lack the medical knowledge to determine clear goals with all necessary slots. In this paper, we identify this challenge as how to construct medical consultation dialogue systems that help patients clarify their goals. To mitigate this challenge, we propose a novel task and create a human-to-human mixed-type medical consultation dialogue corpus, termed MidMed, covering five dialogue types: task-oriented dialogue for diagnosis, recommendation, knowledge-grounded dialogue, QA, and chitchat. MidMed covers four departments (otorhinolaryngology, ophthalmology, skin, and digestive system), with 8,175 dialogues. Furthermore, we build baselines on MidMed and propose an instruction-guiding medical dialogue generation framework, termed InsMed, to address this task. Experimental results show the effectiveness of InsMed.

* Accepted by ACL 2023 main conference. The first two authors contributed equally to this work.
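
The abstract does not detail InsMed's architecture, so the snippet below is only a rough, hypothetical sketch of instruction-guided dialogue generation: a dialogue-type instruction is prepended to the dialogue history before it is fed to a generic seq2seq model. The instruction text and the t5-small checkpoint are illustrative assumptions, not components of the paper.

```python
# Hypothetical sketch of instruction-guided dialogue generation (not the
# paper's InsMed implementation); model and instruction text are assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def generate_response(instruction: str, history: list[str]) -> str:
    # Prepend the dialogue-type instruction to the flattened dialogue history.
    source = instruction + " " + " | ".join(history)
    inputs = tokenizer(source, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generate_response(
    "Give a knowledge-grounded answer about the patient's symptom.",
    ["Patient: My eyes have been itchy for a week.",
     "Doctor: Any redness or discharge?"]))
```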

CBEAF-Adapting: Enhanced Continual Pretraining for Building Chinese Biomedical Language Model

Nov 21, 2022
Yongyu Yan, Kui Xue, Qi Ye, Tong Ruan

Continual pretraining is a standard way of building a domain-specific pretrained language model from a general-domain language model. However, sequential task training may cause catastrophic forgetting, which hurts model performance on downstream tasks. In this paper, we propose a continual pretraining method for BERT-based models, named CBEAF-Adapting (Chinese Biomedical Enhanced Attention-FFN Adapting). Its main idea is to introduce a small number of attention heads and hidden units inside each self-attention layer and feed-forward network. Using the Chinese biomedical domain as a running example, we train a domain-specific language model named CBEAF-RoBERTa and evaluate it on downstream tasks. The results demonstrate that, with only about 3% of the model parameters trained, our method achieves average performance gains of about 0.5% over the best-performing baseline and about 2% over the domain-specific model PCL-MedBERT. We also examine the forgetting problem of different pretraining methods; our method alleviates it by about 13% compared to fine-tuning.
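
As a rough illustration of the "add a few trainable attention heads while freezing the pretrained ones" idea, here is a minimal PyTorch sketch; the head size, wiring, and use of nn.MultiheadAttention are assumptions rather than the paper's exact design, and a symmetric widening would be applied to each feed-forward network as well.

```python
import torch
import torch.nn as nn

class ExtraHeadAttention(nn.Module):
    """Frozen multi-head self-attention plus one small trainable head.

    Sketch of the 'introduce a few attention heads, train only those' idea;
    the head size and wiring are illustrative assumptions.
    """

    def __init__(self, base_attn: nn.MultiheadAttention, d_model: int, d_head: int = 32):
        super().__init__()
        self.base = base_attn
        for p in self.base.parameters():      # keep the pretrained heads frozen
            p.requires_grad = False
        self.q = nn.Linear(d_model, d_head)   # new, trainable head
        self.k = nn.Linear(d_model, d_head)
        self.v = nn.Linear(d_model, d_head)
        self.out = nn.Linear(d_head, d_model)
        self.scale = d_head ** -0.5

    def forward(self, x):                     # x: (batch, seq_len, d_model)
        base_out, _ = self.base(x, x, x, need_weights=False)
        scores = self.q(x) @ self.k(x).transpose(-2, -1) * self.scale
        extra = torch.softmax(scores, dim=-1) @ self.v(x)
        return base_out + self.out(extra)     # add the new head's contribution

base = nn.MultiheadAttention(embed_dim=768, num_heads=12, batch_first=True)
layer = ExtraHeadAttention(base, d_model=768, d_head=32)
print(layer(torch.randn(2, 16, 768)).shape)   # torch.Size([2, 16, 768])
```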


FL-Tuning: Layer Tuning for Feed-Forward Network in Transformer

Jun 30, 2022
Jingping Liu, Yuqiu Song, Kui Xue, Hongli Sun, Chao Wang, Lihan Chen, Haiyun Jiang, Jiaqing Liang, Tong Ruan

Prompt tuning is an emerging way of adapting pre-trained language models to downstream tasks. However, existing studies mainly add prompts to the input sequence, which may not work as expected because the prompts must pass through the intermediate multi-head self-attention and feed-forward network computations, making model optimization less smooth. Hence, we propose a novel tuning method called layer tuning, which adds learnable parameters inside Transformer layers. Specifically, we focus on layer tuning for the feed-forward network in the Transformer, namely FL-tuning, which introduces additional units into the hidden layer of each feed-forward network. We conduct extensive experiments on the public CLUE benchmark. The results show that: 1) FL-tuning outperforms prompt tuning methods under both full-data and few-shot settings in almost all cases. In particular, it improves accuracy by 17.93% (full-data setting) on WSC 1.0 and F1 by 16.142% (few-shot setting) on CLUENER over P-tuning v2. 2) FL-tuning is more stable and converges about 1.17 times faster than P-tuning v2. 3) With only about 3% of the Transformer's parameters trained, FL-tuning is comparable with fine-tuning on most datasets, and significantly outperforms fine-tuning on several (e.g., accuracy improved by 12.9% on WSC 1.1). The source code is available at https://github.com/genggui001/FL-Tuning.
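
A minimal PyTorch sketch of the stated mechanism, adding a few trainable hidden units to a frozen feed-forward layer, is given below; the dimensions and wiring are assumptions, and this is not the released implementation (see the linked repository for that).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FLTunedFFN(nn.Module):
    """Transformer FFN whose hidden layer gains a few trainable extra units.

    The pretrained projections (w_in / w_out) stay frozen; only the extra
    units are updated. Sizes here are illustrative assumptions.
    """

    def __init__(self, w_in: nn.Linear, w_out: nn.Linear, extra_units: int = 64):
        super().__init__()
        self.w_in, self.w_out = w_in, w_out
        for p in list(w_in.parameters()) + list(w_out.parameters()):
            p.requires_grad = False                                  # freeze pretrained FFN
        self.extra_in = nn.Linear(w_in.in_features, extra_units)     # new hidden units
        self.extra_out = nn.Linear(extra_units, w_out.out_features, bias=False)
        self.act = nn.GELU()

    def forward(self, x):
        # Concatenate the original hidden units with the new trainable ones,
        # then project back with the correspondingly widened output matrix.
        hidden = torch.cat([self.act(self.w_in(x)), self.act(self.extra_in(x))], dim=-1)
        weight = torch.cat([self.w_out.weight, self.extra_out.weight], dim=1)
        return F.linear(hidden, weight, self.w_out.bias)

ffn = FLTunedFFN(nn.Linear(768, 3072), nn.Linear(3072, 768), extra_units=64)
print(ffn(torch.randn(2, 16, 768)).shape)   # torch.Size([2, 16, 768])
```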


A multi-perspective combined recall and rank framework for Chinese procedure terminology normalization

Jan 22, 2021
Ming Liang, Kui Xue, Tong Ruan

Medical terminology normalization aims to map a clinical mention to terminologies from a knowledge base, which plays an important role in analyzing Electronic Health Records (EHRs) and in many downstream tasks. In this paper, we focus on Chinese procedure terminology normalization. Terminologies can be expressed in various ways, and one medical mention may be linked to multiple terminologies. Previous studies explore methods such as multi-class classification or learning to rank (LTR) to sort terminologies by literal and semantic information. However, this information is inadequate for finding the right terminologies, particularly in multi-implication cases. In this work, we propose a combined recall and rank framework to address these problems. The framework is composed of a multi-task candidate generator (MTCG), a keywords attentive ranker (KAR), and a fusion block (FB). MTCG predicts the number of implications of a mention and recalls candidates by semantic similarity. KAR is based on BERT with a keywords attentive mechanism that focuses on keywords such as procedure sites and procedure types. FB merges the similarity scores from MTCG and KAR to sort the terminologies from different perspectives. Detailed experimental analysis shows that our proposed framework yields remarkable improvements in both performance and efficiency.
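
The pipeline below is a schematic, toy-scale sketch of the recall-then-rank-then-fuse idea, not the paper's MTCG/KAR/FB implementation: candidates are recalled by a simple embedding similarity and re-scored by a placeholder ranker standing in for the BERT-based keywords attentive ranker.

```python
import numpy as np

def embed(text, vocab):
    """Toy bag-of-characters embedding; stands in for the learned encoders."""
    v = np.zeros(len(vocab))
    for ch in text:
        if ch in vocab:
            v[vocab[ch]] += 1.0
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def recall_candidates(mention, terminologies, vocab, top_k=10):
    """Recall stage: keep the top-k terminologies by embedding similarity."""
    m = embed(mention, vocab)
    scored = [(t, float(m @ embed(t, vocab))) for t in terminologies]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:top_k]

def rank_and_fuse(mention, candidates, ranker, alpha=0.5):
    """Rank stage + fusion: blend the recall similarity with a ranker score."""
    fused = [(t, alpha * recall_score + (1 - alpha) * ranker(mention, t))
             for t, recall_score in candidates]
    return sorted(fused, key=lambda x: x[1], reverse=True)

# Example with a trivial ranker that rewards shared characters (a placeholder
# for a BERT-based keywords-attentive ranker).
terminologies = ["胃镜检查", "肠镜检查", "腹腔镜胆囊切除术"]
vocab = {ch: i for i, ch in enumerate(sorted(set("".join(terminologies))))}
candidates = recall_candidates("胃镜", terminologies, vocab, top_k=2)
ranked = rank_and_fuse("胃镜", candidates,
                       ranker=lambda m, t: len(set(m) & set(t)) / len(set(t)))
print(ranked)
```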


Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text

Aug 21, 2019
Kui Xue, Yangming Zhou, Zhiyuan Ma, Tong Ruan, Huanhuan Zhang, Ping He

Entity and relation extraction is a necessary step in structuring medical text. However, the feature extraction ability of the bidirectional long short-term memory networks used in existing models is limited. At the same time, pre-trained language models have achieved excellent results on more and more natural language processing tasks. In this paper, we present a focused attention model for the joint entity and relation extraction task. Our model integrates the well-known BERT language model into joint learning through a dynamic range attention mechanism, thus improving the feature representation ability of the shared parameter layer. Experimental results on coronary angiography texts collected from Shuguang Hospital show that the F1-scores of the named entity recognition and relation classification tasks reach 96.89% and 88.51%, outperforming state-of-the-art methods by 1.65% and 1.22%, respectively.

* 8 pages, 2 figures, submitted to BIBM 2019 
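
As a rough sketch of joint learning over a shared BERT encoder (without the paper's dynamic range attention mechanism), the snippet below attaches a token-level NER head and a sentence-level relation classification head to the same encoder; the checkpoint name and label counts are placeholder assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class JointExtractor(nn.Module):
    """Shared BERT encoder with an NER head and a relation-classification head.

    A generic joint-learning sketch; the paper's dynamic range attention
    mechanism is not reproduced here. Label counts are placeholders.
    """

    def __init__(self, model_name="bert-base-chinese", num_entity_tags=9, num_relations=5):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)    # shared parameter layer
        hidden = self.encoder.config.hidden_size
        self.ner_head = nn.Linear(hidden, num_entity_tags)      # per-token BIO tags
        self.rel_head = nn.Linear(hidden, num_relations)        # sentence-level relation

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        ner_logits = self.ner_head(out.last_hidden_state)        # (batch, seq, tags)
        rel_logits = self.rel_head(out.last_hidden_state[:, 0])  # [CLS] representation
        return ner_logits, rel_logits

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
model = JointExtractor()
batch = tokenizer(["冠状动脉造影显示前降支狭窄"], return_tensors="pt")
ner_logits, rel_logits = model(batch["input_ids"], batch["attention_mask"])
print(ner_logits.shape, rel_logits.shape)
```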