Philip S. Yu

University of Illinois at Chicago

Decentralized Federated Learning: A Survey and Perspective

Jun 02, 2023

Liangqi Yuan, Lichao Sun, Philip S. Yu, Ziran Wang

Abstract: Federated learning (FL) has been gaining attention for its ability to share knowledge while maintaining user data, protecting privacy, increasing learning efficiency, and reducing communication overhead. Decentralized FL (DFL) is a decentralized network architecture that, in contrast to centralized FL (CFL), eliminates the need for a central server. DFL enables direct communication between clients, resulting in significant savings in communication resources. This paper provides a comprehensive survey of and perspective on DFL. First, the methodology, challenges, and variants of CFL are reviewed, laying the background for DFL. Then, a systematic and detailed perspective on DFL is introduced, including iteration order, communication protocols, network topologies, paradigm proposals, and temporal variability. Next, based on the definition of DFL, several extended variants and categorizations are proposed with state-of-the-art technologies. Lastly, in addition to summarizing the current challenges in DFL, some possible solutions and future research directions are discussed.

GDA: Generative Data Augmentation Techniques for Relation Extraction Tasks

May 26, 2023

Xuming Hu, Aiwei Liu, Zeqi Tan, Xin Zhang, Chenwei Zhang, Irwin King, Philip S. Yu

Abstract: Relation extraction (RE) tasks show promising performance in extracting relations from two entities mentioned in sentences, given sufficient annotations available during training. Such annotations would be labor-intensive to obtain in practice. Existing work adopts data augmentation techniques to generate pseudo-annotated sentences beyond limited annotations. These techniques neither preserve the semantic consistency of the original sentences when rule-based augmentations are adopted, nor preserve the syntax structure of sentences when expressing relations using seq2seq models, resulting in less diverse augmentations. In this work, we propose a dedicated augmentation technique for relational texts, named GDA, which uses two complementary modules to preserve both semantic consistency and syntax structures. We adopt a generative formulation and design a multi-tasking solution to achieve synergies. Furthermore, GDA adopts entity hints as the prior knowledge of the generative model to augment diverse sentences. Experimental results on three datasets under a low-resource setting showed that GDA could bring a 2.0% F1 improvement compared with no augmentation technique. Source code and data are available.

* ACL 2023
* Accepted to ACL 2023 (Findings), Long Paper, 12 pages

Multimodal Relation Extraction with Cross-Modal Retrieval and Synthesis

May 25, 2023

Xuming Hu, Zhijiang Guo, Zhiyang Teng, Irwin King, Philip S. Yu

Abstract: Multimodal relation extraction (MRE) is the task of identifying the semantic relationships between two entities based on the context of a sentence-image pair. Existing retrieval-augmented approaches mainly focus on modeling the retrieved textual knowledge, but this may not be enough to accurately identify complex relations. To improve the prediction, this research proposes to retrieve textual and visual evidence based on the object, sentence, and whole image. We further develop a novel approach to synthesize the object-level, image-level, and sentence-level information for better reasoning between the same and different modalities. Extensive experiments and analyses show that the proposed method is able to effectively select and compare evidence across modalities and significantly outperforms state-of-the-art models.

* Accepted to ACL 2023

Give Me More Details: Improving Fact-Checking with Latent Retrieval

May 25, 2023

Xuming Hu, Zhijiang Guo, Guanyu Wu, Lijie Wen, Philip S. Yu

Abstract: Evidence plays a crucial role in automated fact-checking. When verifying real-world claims, existing fact-checking systems either assume the evidence sentences are given or use the search snippets returned by the search engine. Such methods ignore the challenges of collecting evidence and may not provide sufficient information to verify real-world claims. Aiming at building a better fact-checking system, we propose to incorporate full text from source documents as evidence and introduce two enriched datasets. The first one is a multilingual dataset, while the second one is monolingual (English). We further develop a latent variable model to jointly extract evidence sentences from documents and perform claim verification. Experiments indicate that including source documents can provide sufficient contextual clues even when gold evidence sentences are not annotated. The proposed system is able to achieve significant improvements upon best-reported models under different settings.

* Accepted to ACL 2023, 15 pages

Enhancing Cross-lingual Natural Language Inference by Soft Prompting with Multilingual Verbalizer

May 22, 2023

Shuang Li, Xuming Hu, Aiwei Liu, Yawen Yang, Fukun Ma, Philip S. Yu, Lijie Wen

Abstract: Cross-lingual natural language inference is a fundamental problem in cross-lingual language understanding. Many recent works have used prompt learning to address the lack of annotated parallel corpora in XNLI. However, these methods adopt discrete prompting by simply translating the templates to the target language and need external expert knowledge to design the templates. Besides, discrete prompts of human-designed template words are not trainable vectors and cannot be flexibly migrated to target languages in the inference stage. In this paper, we propose a novel Soft prompt learning framework with the Multilingual Verbalizer (SoftMV) for XNLI. SoftMV first constructs a cloze-style question with soft prompts for the input sample. Then we leverage bilingual dictionaries to generate an augmented multilingual question for the original question. SoftMV adopts a multilingual verbalizer to align the representations of original and augmented multilingual questions into the same semantic space with consistency regularization. Experimental results on XNLI demonstrate that SoftMV can achieve state-of-the-art performance and significantly outperform the previous methods under the few-shot and full-shot cross-lingual transfer settings.

* Accepted at ACL 2023

Gaussian Prior Reinforcement Learning for Nested Named Entity Recognition

May 12, 2023

Yawen Yang, Xuming Hu, Fukun Ma, Shu'ang Li, Aiwei Liu, Lijie Wen, Philip S. Yu

Abstract: Named Entity Recognition (NER) is a widely studied task in natural language processing. Recently, nested NER has attracted more attention because of its practicality and difficulty. Existing works on nested NER ignore the recognition order and the boundary position relation of nested entities. To address these issues, we propose a novel seq2seq model named GPRL, which formulates the nested NER task as an entity triplet sequence generation process. GPRL adopts the reinforcement learning method to generate entity triplets, decoupling the entity order in gold labels, and expects to learn a reasonable recognition order of entities via trial and error. Based on statistics of boundary distance for nested entities, GPRL designs a Gaussian prior to represent the boundary distance distribution between nested entities and adjust the output probability distribution of nested boundary tokens. Experiments on three nested NER datasets demonstrate that GPRL outperforms previous nested NER models.

* Accepted by ICASSP 2023

Zero-shot Item-based Recommendation via Multi-task Product Knowledge Graph Pre-Training

May 12, 2023

Ziwei Fan, Zhiwei Liu, Shelby Heinecke, Jianguo Zhang, Huan Wang, Caiming Xiong, Philip S. Yu

Abstract: Existing recommender systems face difficulties with zero-shot items, i.e., items that have no historical interactions with users during the training stage. Though recent works extract universal item representations via pre-trained language models (PLMs), they ignore the crucial item relationships. This paper presents a novel paradigm for the Zero-Shot Item-based Recommendation (ZSIR) task, which pre-trains a model on a product knowledge graph (PKG) to refine the item features from PLMs. We identify three challenges for pre-training on the PKG: multi-type relations in the PKG, semantic divergence between item generic information and relations, and domain discrepancy between the PKG and the downstream ZSIR task. We address the challenges by proposing four pre-training tasks and novel task-oriented adaptation (ToA) layers. Moreover, this paper discusses how to fine-tune the model on a new recommendation task such that the ToA layers are adapted to the ZSIR task. Comprehensive experiments on an 18-market dataset are conducted to verify the effectiveness of the proposed model in both knowledge prediction and the ZSIR task.

* 11 pages

Contrastive Graph Clustering in Curvature Spaces

May 05, 2023

Li Sun, Feiyang Wang, Junda Ye, Hao Peng, Philip S. Yu

Abstract: Graph clustering is a longstanding research topic and has achieved remarkable success with deep learning methods in recent years. Nevertheless, we observe that several important issues largely remain open. On the one hand, graph clustering from the geometric perspective is appealing but has rarely been touched before, as it lacks a promising space for geometric clustering. On the other hand, contrastive learning boosts deep graph clustering but usually struggles in either graph augmentation or hard sample mining. To bridge this gap, we rethink the problem of graph clustering from a geometric perspective and, to the best of our knowledge, make the first attempt to introduce a heterogeneous curvature space to the graph clustering problem. Correspondingly, we present a novel end-to-end contrastive graph clustering model named CONGREGATE, addressing geometric graph clustering with Ricci curvatures. To support geometric clustering, we construct a theoretically grounded Heterogeneous Curvature Space where deep representations are generated via the product of the proposed fully Riemannian graph convolutional nets. Thereafter, we train the graph clusters by an augmentation-free reweighted contrastive approach where we pay more attention to both hard negatives and hard positives in our curvature space. Empirical results on real-world graphs show that our model outperforms the state-of-the-art competitors.

* Accepted by IJCAI'23

Read it Twice: Towards Faithfully Interpretable Fact Verification by Revisiting Evidence

May 02, 2023

Xuming Hu, Zhaochen Hong, Zhijiang Guo, Lijie Wen, Philip S. Yu

Abstract: The real-world fact verification task aims to verify the factuality of a claim by retrieving evidence from the source document. The quality of the retrieved evidence plays an important role in claim verification. Ideally, the retrieved evidence should be faithful (reflecting the model's decision-making process in claim verification) and plausible (convincing to humans), and should improve the accuracy of the verification task. Although existing approaches leverage the similarity measure of semantic or surface form between claims and documents to retrieve evidence, they all rely on certain heuristics that prevent them from satisfying all three requirements. In light of this, we propose a fact verification model named ReRead to retrieve evidence and verify claims. ReRead (1) trains the evidence retriever to obtain interpretable evidence (i.e., evidence meeting the faithfulness and plausibility criteria); and (2) trains the claim verifier to revisit the evidence retrieved by the optimized evidence retriever to improve the accuracy. The proposed system is able to achieve significant improvements over best-reported models under different settings.

* SIGIR 2023

Think Rationally about What You See: Continuous Rationale Extraction for Relation Extraction

May 02, 2023

Xuming Hu, Zhaochen Hong, Chenwei Zhang, Irwin King, Philip S. Yu

Abstract: Relation extraction (RE) aims to extract potential relations according to the context of two entities, so deriving rational contexts from sentences plays an important role. Previous works either focus on how to leverage the entity information (e.g., entity types, entity verbalization) to infer relations but ignore context-focused content, or use counterfactual thinking to remove the model's bias toward potential relations in entities, in which case the relation reasoning process is still hindered by irrelevant content. Therefore, how to preserve relevant content and remove noisy segments from sentences is a crucial task. In addition, retained content needs to be fluent enough to maintain semantic coherence and interpretability. In this work, we propose a novel rationale extraction framework named RE2, which leverages two factors, continuity and sparsity, to obtain relevant and coherent rationales from sentences. To solve the problem that the gold rationales are not labeled, RE2 applies an optimizable binary mask to each token in the sentence and adjusts the rationales that need to be selected according to the relation label. Experiments on four datasets show that RE2 surpasses baselines.

* SIGIR 2023
