Tao Qi

Improving Attention Mechanism with Query-Value Interaction

Oct 08, 2020
Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang

The attention mechanism has played a critical role in various state-of-the-art NLP models such as Transformer and BERT. It can be formulated as a ternary function that maps the input queries, keys and values to an output, computed as a sum of the values weighted by attention weights derived from the interactions between queries and keys. As with query-key interactions, there is also inherent relatedness between queries and values, and incorporating query-value interactions has the potential to enhance the output by learning customized values according to the characteristics of the queries. However, existing attention methods ignore query-value interactions, which may be suboptimal. In this paper, we propose to improve the existing attention mechanism by incorporating query-value interactions. We propose a query-value interaction function that learns query-aware attention values and combines them with the original values and attention weights to form the final output. Extensive experiments on four datasets for different tasks show that our approach can consistently improve the performance of many attention-based models by incorporating query-value interactions.
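
The formulation in this abstract can be illustrated with a minimal pure-Python sketch. `standard_attention` is ordinary scaled dot-product attention, where the weights depend only on queries and keys. `query_value_attention` adds a query-value interaction, but the abstract does not specify the interaction function, so the elementwise sigmoid gate used here is purely an assumption for illustration and not the paper's method. All function names are hypothetical, and the sketch assumes the query and value vectors have the same dimension.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def attention_weights(query, keys):
    """Attention weights from query-key interactions (scaled dot product)."""
    scale = math.sqrt(len(query))
    return softmax([dot(query, k) / scale for k in keys])

def standard_attention(query, keys, values):
    """Output is the sum of the values weighted by the attention weights."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def query_value_attention(query, keys, values):
    """Assumed variant: each value is gated by the query, then combined
    with the original value before the weighted sum."""
    weights = attention_weights(query, keys)
    gate = [sigmoid(q) for q in query]  # hypothetical interaction function
    dim = len(values[0])
    output = []
    for j in range(dim):
        total = 0.0
        for w, v in zip(weights, values):
            query_aware = gate[j] * v[j]       # query-aware value (assumption)
            total += w * (v[j] + query_aware)  # combine with original value
        output.append(total)
    return output

query = [0.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[1.0, 2.0], [3.0, 4.0]]
print(standard_attention(query, keys, values))
print(query_value_attention(query, keys, values))
```

With a zero query the gate is 0.5 in every dimension, so the second output is 1.5 times the first. For any other query, the values themselves change with the query, which is the property the abstract describes.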

Graph Enhanced Representation Learning for News Recommendation

Mar 31, 2020
Suyu Ge, Chuhan Wu, Fangzhao Wu, Tao Qi, Yongfeng Huang

With the explosion of online news, personalized news recommendation has become increasingly important for online news platforms to help their users find interesting information. Existing news recommendation methods achieve personalization by building accurate news representations from news content and user representations from their direct interactions with news (e.g., clicks), while ignoring the high-order relatedness between users and news. Here we propose a news recommendation method that enhances the representation learning of users and news by modeling their relatedness in a graph setting. In our method, users and news are both viewed as nodes in a bipartite graph constructed from historical user click behaviors. For news representations, a transformer architecture is first used to build semantic representations of news. We then combine these with information from neighbor news in the graph via a graph attention network. For user representations, we not only represent users from their historically clicked news, but also attentively incorporate the representations of their neighbor users in the graph. Improved performance on a large-scale real-world dataset validates the effectiveness of our proposed method.
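
The graph construction described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not define "neighbor" precisely, so the sketch assumes that neighbor news are news clicked by at least one common user, and neighbor users are users who clicked at least one common news item. The function names and the toy click log are hypothetical.

```python
from collections import defaultdict

def build_bipartite_graph(clicks):
    """Build user->news and news->user adjacency from (user, news) click pairs."""
    user_to_news = defaultdict(set)
    news_to_users = defaultdict(set)
    for user, news in clicks:
        user_to_news[user].add(news)
        news_to_users[news].add(user)
    return user_to_news, news_to_users

def neighbor_news(news_id, user_to_news, news_to_users):
    """News reachable in two hops: clicked by a user who also clicked news_id."""
    result = set()
    for user in news_to_users.get(news_id, ()):
        result |= user_to_news[user]
    result.discard(news_id)
    return result

def neighbor_users(user_id, user_to_news, news_to_users):
    """Users reachable in two hops: clicked a news item that user_id also clicked."""
    result = set()
    for news in user_to_news.get(user_id, ()):
        result |= news_to_users[news]
    result.discard(user_id)
    return result

# Toy click log (hypothetical data).
clicks = [("u1", "n1"), ("u1", "n2"), ("u2", "n2"), ("u2", "n3")]
u2n, n2u = build_bipartite_graph(clicks)
print(sorted(neighbor_news("n2", u2n, n2u)))   # ['n1', 'n3']
print(sorted(neighbor_users("u1", u2n, n2u)))  # ['u2']
```

In the method the abstract describes, the representations of these neighbors would then be aggregated with attention; that step is omitted here because its details are not given.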

FedNER: Privacy-preserving Medical Named Entity Recognition with Federated Learning

Mar 25, 2020
Suyu Ge, Fangzhao Wu, Chuhan Wu, Tao Qi, Yongfeng Huang, Xing Xie

Medical named entity recognition (NER) has wide applications in intelligent healthcare. Sufficient labeled data is critical for training an accurate medical NER model. However, the labeled data in a single medical platform is usually limited. Although labeled datasets may exist in many different medical platforms, they cannot be directly shared since medical data is highly privacy-sensitive. In this paper, we propose a privacy-preserving medical NER method based on federated learning, which can leverage the labeled data in different platforms to boost the training of a medical NER model and removes the need to exchange raw data among platforms. Since the labeled data in different platforms usually differs in entity types and annotation criteria, instead of constraining different platforms to share the same model, we decompose the medical NER model in each platform into a shared module and a private module. The private module is used to capture the characteristics of the local data in each platform, and is updated using local labeled data. The shared module is learned across different medical platforms to capture the shared NER knowledge. Its local gradients from different platforms are aggregated to update the global shared module, which is then delivered to each platform to update its local shared module. Experiments on three publicly available datasets validate the effectiveness of our method.
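
The shared/private update scheme in this abstract can be sketched with a toy example. This is not the FedNER model: the NER network is replaced by a toy quadratic loss so that gradients are easy to follow, the aggregation is assumed to be an unweighted mean (the abstract only says the gradients are "aggregated"), and the `Platform` class, `federated_round` function and learning rate are hypothetical.

```python
class Platform:
    """One medical platform holding a private module and local (toy) data."""

    def __init__(self, shared_target, private_target, private_init):
        # Targets stand in for local labeled data; they never leave the platform.
        self.shared_target = shared_target
        self.private_target = private_target
        self.private = list(private_init)

    def local_gradients(self, shared):
        """Gradients of the toy loss 0.5 * ||params - target||^2."""
        shared_grad = [s - t for s, t in zip(shared, self.shared_target)]
        private_grad = [p - t for p, t in zip(self.private, self.private_target)]
        return shared_grad, private_grad

def federated_round(global_shared, platforms, lr=0.5):
    """Run one round and return the updated global shared module."""
    shared_grads = []
    for platform in platforms:
        shared_grad, private_grad = platform.local_gradients(global_shared)
        # The private module is updated locally and is never uploaded.
        platform.private = [p - lr * g for p, g in zip(platform.private, private_grad)]
        # Only the shared-module gradient is sent to the server.
        shared_grads.append(shared_grad)
    # Server side: average the uploaded gradients (assumed aggregation rule).
    count = len(shared_grads)
    mean_grad = [sum(g[i] for g in shared_grads) / count
                 for i in range(len(global_shared))]
    # The returned parameters would be delivered back to every platform.
    return [s - lr * g for s, g in zip(global_shared, mean_grad)]

platforms = [Platform([1.0], [4.0], [0.0]), Platform([3.0], [-2.0], [0.0])]
shared = federated_round([0.0], platforms)
print(shared)                          # [1.0]
print([p.private for p in platforms])  # [[2.0], [-1.0]]
```

Only gradients of the shared module cross platform boundaries in this sketch. The raw data (here, the targets) and the private modules stay local, which mirrors the separation the abstract describes.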
