Buzhou Tang

PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain

Oct 22, 2023
Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang

Biomedical language understanding benchmarks are the driving force behind artificial intelligence applications with large language model (LLM) back-ends. However, most current benchmarks: (a) are limited to English, which makes it challenging to replicate many of their successes in other languages; (b) focus on knowledge probing of LLMs and neglect to evaluate how LLMs apply this knowledge to a wide range of biomedical tasks; or (c) have become publicly available corpora and may have leaked into LLMs' pre-training data. To facilitate research on medical LLMs, we rebuild the Chinese Biomedical Language Understanding Evaluation (CBLUE) benchmark into a large-scale prompt-tuning benchmark, PromptCBLUE. Our benchmark is a suitable test-bed and online platform for evaluating the multi-task capabilities of Chinese LLMs on a wide range of biomedical tasks, including medical entity recognition, medical text classification, medical natural language inference, medical dialogue understanding, and medical content/dialogue generation. To establish evaluation on these tasks, we experiment with nine current Chinese LLMs and report results under different fine-tuning techniques.
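The core idea of such a prompt-tuning benchmark is to rewrite structured NLU examples as instruction-style prompt/response pairs for LLM fine-tuning. A minimal sketch of this conversion for a named-entity-recognition example follows; the template, field names, and output format here are invented for illustration, not PromptCBLUE's actual format:

```python
# Illustrative conversion of a structured NER example into a
# (prompt, target) text pair. Template and field names are
# hypothetical, not PromptCBLUE's actual format.

def to_prompt(example):
    """Turn a structured NER example into instruction-style text."""
    prompt = (
        "Find all medical entities in the following sentence "
        f"and list them with their types.\nSentence: {example['text']}"
    )
    target = "; ".join(
        f"{ent['span']} ({ent['type']})" for ent in example["entities"]
    )
    return prompt, target

example = {
    "text": "The patient was given aspirin for chest pain.",
    "entities": [
        {"span": "aspirin", "type": "drug"},
        {"span": "chest pain", "type": "symptom"},
    ],
}
prompt, target = to_prompt(example)
print(target)  # aspirin (drug); chest pain (symptom)
```

The same pattern applies to the other task families (classification, inference, generation): each structured example becomes one natural-language instruction plus one reference answer.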

SHAPE: A Sample-adaptive Hierarchical Prediction Network for Medication Recommendation

Sep 09, 2023
Sicen Liu, Xiaolong Wang, JIngcheng Du, Yongshuai Hou, Xianbing Zhao, Hui Xu, Hui Wang, Yang Xiang, Buzhou Tang

Effective medication recommendation for patients with complex multimorbidity conditions is a critical task in healthcare. Most existing works predict medications from longitudinal records, assuming that the information-transmission patterns of longitudinal sequence data are stable and that intra-visit medical events can be serialized. However, two conditions have largely been ignored: 1) a more compact encoder for the relationships among intra-visit medical events is needed; 2) patients' longitudinal sequences vary in length, so the strategies for learning accurate representations differ across samples. In this paper, we propose a novel Sample-adaptive Hierarchical medicAtion Prediction nEtwork, termed SHAPE, to tackle these challenges in the medication recommendation task. Specifically, we design a compact intra-visit set encoder that encodes the relationships among medical events to obtain visit-level representations, and then develop an inter-visit longitudinal encoder to efficiently learn patient-level longitudinal representations. To endow the model with the capability of modeling variable visit lengths, we introduce a soft curriculum learning method that automatically assigns each sample's difficulty according to its visit length. Extensive experiments on a benchmark dataset verify the superiority of our model over several state-of-the-art baselines.
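The general shape of a length-based soft curriculum can be sketched as follows: treat the normalized visit length as the sample's difficulty, and ramp each sample's loss weight up as training progresses. The specific schedule below is a guess at the general idea for illustration, not SHAPE's actual formulation:

```python
# Illustrative soft curriculum: short sequences (easy) dominate early
# training; longer ones are phased in as training progresses.
# The ramp function is hypothetical, not SHAPE's actual schedule.

def curriculum_weight(visit_len, max_len, progress):
    """Loss weight in [0, 1] for a sample with `visit_len` visits.

    `progress` is training progress in [0, 1]. Difficulty is the
    normalized visit length; a sample becomes fully weighted once
    the curriculum 'front' (progress) reaches its difficulty.
    """
    difficulty = visit_len / max_len
    # Linear ramp around the curriculum front, clipped to [0, 1].
    return min(1.0, max(0.0, (progress - difficulty) * 4.0 + 1.0))

# Early in training, a 2-visit patient is fully weighted while a
# 20-visit patient is ignored.
print(curriculum_weight(2, 20, 0.1))   # 1.0
print(curriculum_weight(20, 20, 0.1))  # 0.0
print(curriculum_weight(20, 20, 1.0))  # 1.0
```

Any monotone ramp (linear, sigmoid, step) fits the same pattern; the point is that sample weights depend smoothly on visit length rather than on a hard length cutoff.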

* 11 pages, 6 figures 

Revisiting Event Argument Extraction: Can EAE Models Learn Better When Being Aware of Event Co-occurrences?

Jun 01, 2023
Yuxin He, Jingyue Hu, Buzhou Tang

Event co-occurrences have been proven effective for event extraction (EE) in previous studies, but have not been considered for event argument extraction (EAE) recently. In this paper, we try to fill this gap between EE and EAE research by highlighting the question "Can EAE models learn better when being aware of event co-occurrences?". To answer it, we reformulate EAE as a table-generation problem and extend a SOTA prompt-based EAE model into a non-autoregressive generation framework, called TabEAE, which extracts the arguments of multiple events in parallel. Under this framework, we experiment with 3 training-inference schemes on 4 datasets (ACE05, RAMS, WikiEvents and MLEE) and discover that, by training the model to extract all events in parallel, it can better distinguish the semantic boundary of each event, and its ability to extract a single event improves substantially. Experimental results show that our method achieves new state-of-the-art performance on all 4 datasets. Our code is available at https://github.com/Stardust-hyx/TabEAE.
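The "EAE as table generation" view can be pictured with a small data structure: each co-occurring event in a sentence becomes one row, and the columns are its argument roles. The sentence, triggers, and role names below are made up for illustration and are not drawn from the actual datasets:

```python
# Sketch of the table-generation target for a sentence with two
# co-occurring events. Triggers and role names are illustrative.

sentence = "The court fined the company $2M after the merger."
table = [
    # (trigger, {role: argument})
    ("fined",  {"Adjudicator": "the court",
                "Defendant": "the company",
                "Fine": "$2M"}),
    ("merger", {"Org": "the company"}),
]

# A non-autoregressive decoder would emit all rows in parallel;
# here we just render the table to show the target structure.
for trigger, args in table:
    row = ", ".join(f"{role}={arg}" for role, arg in args.items())
    print(f"{trigger}: {row}")
```

Extracting both rows jointly is what lets a model exploit event co-occurrence: the two events share an argument ("the company"), and seeing both rows at once helps delimit each event's span of influence.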

* Accepted to ACL 2023 main conference 

CATNet: Cross-event Attention-based Time-aware Network for Medical Event Prediction

Apr 29, 2022
Sicen Liu, Xiaolong Wang, Yang Xiang, Hui Xu, Hui Wang, Buzhou Tang

Medical event prediction (MEP) is a fundamental task in the medical domain: it predicts medical events, including medications, diagnosis codes, laboratory tests, procedures, and outcomes, from historical medical records. The task is challenging because medical data is a type of complex time-series data with heterogeneous and temporally irregular characteristics. Many machine learning methods that consider these two characteristics have been proposed for medical event prediction. However, most of them consider the two characteristics separately and ignore the correlations among different types of medical events, especially the relations between historical medical events and target medical events. In this paper, we propose a novel attention-based neural network, called the cross-event attention-based time-aware network (CATNet), for medical event prediction. It is a time-aware, event-aware and task-adaptive method with the following advantages: 1) it models heterogeneous information and temporal information in a unified way and considers temporally irregular characteristics both locally and globally; 2) it takes full advantage of the correlations among different types of events via cross-event attention. Experiments on two public datasets (MIMIC-III and eICU) show that CATNet adapts to different MEP tasks and outperforms other state-of-the-art methods on various MEP tasks. The source code of CATNet will be released after this manuscript is accepted.
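The cross-event attention idea can be sketched with generic scaled dot-product attention: the target event's representation acts as the query, and historical events of all types act as keys and values. The single-head form and shapes below are illustrative, not CATNet's exact design:

```python
# Generic scaled dot-product attention between a target-event vector
# (query) and historical-event vectors (keys/values), in the spirit
# of cross-event attention. Single-head, for illustration only.
import numpy as np

def cross_event_attention(query, history):
    """query: (d,) target-event vector; history: (n, d) past events."""
    d = query.shape[-1]
    scores = history @ query / np.sqrt(d)      # (n,) similarity scores
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()          # softmax over past events
    return weights @ history, weights          # attended summary, weights

rng = np.random.default_rng(0)
q = rng.standard_normal(8)                     # target event, dim 8
h = rng.standard_normal((5, 8))                # 5 historical events
ctx, w = cross_event_attention(q, h)
print(w.sum())  # softmax weights sum to 1
```

Because the weights are computed across all historical event types at once, correlated events (e.g. a lab test that typically precedes a medication) can directly influence the target event's representation.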

* 15 pages, 4 figures 

Two heads are better than one: Enhancing medical representations by pre-training over structured and unstructured electronic health records

Jan 25, 2022
Sicen Liu, Xiaolong Wang, Yongshuai Hou, Ge Li, Hui Wang, Hui Xu, Yang Xiang, Buzhou Tang

The massive context of electronic health records (EHRs) has created enormous potential for improving healthcare, among which structured (coded) data and unstructured (text) data are two important textual modalities. They do not exist in isolation and can complement each other in most real-life clinical scenarios. Most existing research in medical informatics, however, either focuses only on a particular modality or straightforwardly concatenates the information from different modalities, ignoring the interaction and information sharing between them. To address these issues, we propose a unified deep learning-based medical pre-trained language model, named UMM-PLM, to automatically learn representative features from multimodal EHRs that consist of both structured and unstructured data. Specifically, we first develop parallel unimodal information representation modules to capture modality-specific characteristics, where unimodal representations are learned from each data source separately. A cross-modal module is further introduced to model the interactions between different modalities. We pre-train the model on a large EHR dataset containing both structured and unstructured data, and verify its effectiveness on three downstream clinical tasks, i.e., medication recommendation, 30-day readmission, and ICD coding, through extensive experiments. The results demonstrate the power of UMM-PLM compared with benchmark methods and state-of-the-art baselines. Analyses show that UMM-PLM can effectively exploit multimodal textual information and has the potential to provide more comprehensive interpretations for clinical decision making.

* 31 pages, 5 figures 

Learnable Compression Network with Transformer for Approximate Nearest Neighbor Search

Jul 30, 2021
Haokui Zhang, Wenze Hu, Xiaoyu Wang, Buzhou Tang

Approximate nearest neighbor search (ANNS) plays a crucial role in information retrieval and has a wide range of application scenarios. Accordingly, many fast ANNS approaches have been proposed over the past several years. Among them, graph-based methods are one of the most popular types, as they offer attractive theoretical guarantees and low query latency. In this paper, we propose a learnable compression network with Transformer (LCNT), which projects feature vectors from a high-dimensional space onto a low-dimensional space while preserving neighbor relationships. The proposed model can be combined with existing graph-based methods to accelerate the construction of the indexing graph and further reduce query latency. Specifically, LCNT contains two major parts: a projection part and a harmonizing part. In the projection part, input vectors are projected into a sequence of subspaces via a multi-channel sparse projection network. In the harmonizing part, a modified Transformer network harmonizes the features in these subspaces and combines them into a new feature. To evaluate the effectiveness of the proposed model, we conduct experiments on two million-scale databases, GIST1M and Deep1M. Experimental results show that the proposed model can speed up indexing-graph construction by a factor of 2 to 3 without significantly sacrificing accuracy, and reduces query latency by a factor of 1.3 to 2.0. In addition, the proposed model can be combined with other popular quantization methods.
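The two-part pipeline can be sketched numerically: project a high-dimensional vector through several channels into low-dimensional subspaces, then combine the subspace features into one compact code. Here a plain mean stands in for the Transformer-based harmonizing part, and all dimensions and matrices are illustrative placeholders for the learned components:

```python
# Sketch of "project into subspaces, then combine": random matrices
# stand in for the learned multi-channel projection, and a mean
# stands in for the Transformer-based harmonizing part.
import numpy as np

rng = np.random.default_rng(1)
d_in, n_channels, d_sub = 960, 4, 32   # illustrative dimensions

# One projection matrix per channel (learned in the real model).
projections = rng.standard_normal((n_channels, d_in, d_sub)) / np.sqrt(d_in)

def compress(x):
    """Map x of shape (d_in,) to a compact (d_sub,) code."""
    subspaces = np.stack([x @ P for P in projections])  # (n_channels, d_sub)
    return subspaces.mean(axis=0)                       # 'harmonized' code

x = rng.standard_normal(d_in)
code = compress(x)
print(code.shape)  # (32,)
```

The benefit for graph-based ANNS is that distance computations during index construction and querying run on the 32-dimensional codes instead of the original 960-dimensional vectors.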

* 7 pages and 2 figures 

CBLUE: A Chinese Biomedical Language Understanding Evaluation Benchmark

Jul 06, 2021
Mosha Chen, Chuanqi Tan, Zhen Bi, Xiaozhuan Liang, Lei Li, Ningyu Zhang, Xin Shang, Kangping Yin, Jian Xu, Fei Huang, Luo Si, Yuan Ni, Guotong Xie, Zhifang Sui, Baobao Chang, Hui Zong, Zheng Yuan, Linfeng Li, Jun Yan, Hongying Zan, Kunli Zhang, Buzhou Tang, Qingcai Chen

Artificial Intelligence (AI), along with recent progress in biomedical language understanding, is gradually changing medical practice. With the development of biomedical language understanding benchmarks, AI applications have become widely used in the medical field. However, most benchmarks are limited to English, which makes it challenging to replicate many of their successes in other languages. To facilitate research in this direction, we collect real-world biomedical data and present the first Chinese Biomedical Language Understanding Evaluation (CBLUE) benchmark: a collection of natural language understanding tasks, including named entity recognition, information extraction, clinical diagnosis normalization, and single-sentence/sentence-pair classification, together with an associated online platform for model evaluation, comparison, and analysis. To establish evaluation on these tasks, we report empirical results with 11 current pre-trained Chinese models; the experimental results show that state-of-the-art neural models still perform far worse than the human ceiling. Our benchmark is released at \url{https://tianchi.aliyun.com/dataset/dataDetail?dataId=95414&lang=en-us}.


Decomposing Word Embedding with the Capsule Network

Apr 07, 2020
Xin Liu, Qingcai Chen, Yan Liu, Baotian Hu, Joanna Siebert, Xiangping Wu, Buzhou Tang

Multi-sense word embeddings have been a promising solution for word sense learning. Nevertheless, building large-scale training corpora and learning appropriate word senses are still open issues. In this paper, we propose a method for Decomposing the word Embedding into context-specific Sense representations, called DecE2S. First, an unsupervised polysemous embedding is fed into a capsule network to produce multiple sememe-like vectors. Second, using attention operations, DecE2S integrates the word's context to represent the context-specific sense vector. To train DecE2S, we design a word-matching training method for learning the context-specific sense representation. DecE2S was evaluated on two sense learning tasks, i.e., word in context and word sense disambiguation. Results on the two public corpora, Word-in-Context and English all-words Word Sense Disambiguation, show that DecE2S achieves new state-of-the-art performance on both tasks.
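The second step, selecting a context-specific sense, can be sketched as attention over the decomposed sense vectors: score each sememe-like vector against a context vector, then take the softmax-weighted sum. This toy version illustrates the general mechanism, not DecE2S itself:

```python
# Toy sketch of context-specific sense selection: attend over a
# word's sememe-like sense vectors with a context vector.
import numpy as np

def context_sense(context_vec, sense_vecs):
    """context_vec: (d,); sense_vecs: (k, d) decomposed sense vectors."""
    scores = sense_vecs @ context_vec          # similarity to context
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                   # softmax over senses
    return weights @ sense_vecs                # context-specific sense

# Two made-up senses of "bank"; a "river"-like context should pull
# the result toward the second sense vector.
senses = np.array([[1.0, 0.0],    # finance sense
                   [0.0, 1.0]])   # river sense
river_context = np.array([0.0, 3.0])
sense = context_sense(river_context, senses)
print(sense[1] > sense[0])  # True
```

The capsule network's job in the full model is to produce those sense vectors from a single polysemous embedding in the first place; the attention step then picks out the mixture appropriate to the sentence at hand.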


Overview of the CCKS 2019 Knowledge Graph Evaluation Track: Entity, Relation, Event and QA

Mar 09, 2020
Xianpei Han, Zhichun Wang, Jiangtao Zhang, Qinghua Wen, Wenqi Li, Buzhou Tang, Qi Wang, Zhifan Feng, Yang Zhang, Yajuan Lu, Haitao Wang, Wenliang Chen, Hao Shao, Yubo Chen, Kang Liu, Jun Zhao, Taifeng Wang, Kezun Zhang, Meng Wang, Yinlin Jiang, Guilin Qi, Lei Zou, Sen Hu, Minhao Zhang, Yinnian Lin

A knowledge graph models world knowledge as concepts, entities, and the relationships between them, and has been widely used in many real-world tasks. CCKS 2019 held an evaluation track with 6 tasks that attracted more than 1,600 teams. In this paper, we give an overview of the knowledge graph evaluation track at CCKS 2019. By reviewing the task definitions, successful methods, useful resources, good strategies, and research challenges associated with each task, this paper can serve as a helpful reference for developing knowledge graph applications and conducting future knowledge graph research.

* 21 pages, in Chinese, 9 figures and 17 tables, CCKS 2019 held an evaluation track about knowledge graph with 6 tasks and attracted more than 1,600 teams 