Philip S. Yu

University of Illinois at Chicago

Click-Conversion Multi-Task Model with Position Bias Mitigation for Sponsored Search in eCommerce

Jul 29, 2023

Yibo Wang, Yanbing Xue, Bo Liu, Musen Wen, Wenting Zhao, Stephen Guo, Philip S. Yu

Abstract: Position bias, the phenomenon whereby users tend to focus on higher-ranked items of the search result list regardless of the actual relevance to queries, is prevailing in many ranking systems. Position bias in training data biases the ranking model, leading to increasingly unfair item rankings, click-through-rate (CTR), and conversion rate (CVR) predictions. To jointly mitigate position bias in both item CTR and CVR prediction, we propose two position-bias-free CTR and CVR prediction models: Position-Aware Click-Conversion (PACC) and PACC via Position Embedding (PACC-PE). PACC is built upon probability decomposition and models position information as a probability. PACC-PE utilizes neural networks to model product-specific position information as embedding. Experiments on the E-commerce sponsored product search dataset show that our proposed models have better ranking effectiveness and can greatly alleviate position bias in both CTR and CVR prediction.

* In Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1884-1888. 2023
* Corrected some typos in the published SIGIR version

A Survey on Evaluation of Large Language Models

Jul 18, 2023

Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang(+6 more)

Abstract: Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide an overview from the perspective of evaluation tasks, encompassing general natural language processing tasks, reasoning, medical usage, ethics, education, natural and social sciences, agent applications, and other areas. Secondly, we answer the 'where' and 'how' questions by diving into the evaluation methods and benchmarks, which serve as crucial components in assessing performance of LLMs. Then, we summarize the success and failure cases of LLMs in different tasks. Finally, we shed light on several future challenges that lie ahead in LLMs evaluation. Our aim is to offer invaluable insights to researchers in the realm of LLMs evaluation, thereby aiding the development of more proficient LLMs. Our key point is that evaluation should be treated as an essential discipline to better assist the development of LLMs. We consistently maintain the related open-source materials at: https://github.com/MLGroupJLU/LLM-eval-survey.

* 25 pages; more work is at: https://llm-eval.github.io/

Enhancing Cross-lingual Transfer via Phonemic Transcription Integration

Jul 10, 2023

Hoang H. Nguyen, Chenwei Zhang, Tao Zhang, Eugene Rohrbaugh, Philip S. Yu

Abstract: Previous cross-lingual transfer methods are restricted to orthographic representation learning via textual scripts. This limitation hampers cross-lingual transfer and is biased towards languages sharing similar well-known scripts. To alleviate the gap between languages from different writing scripts, we propose PhoneXL, a framework incorporating phonemic transcriptions as an additional linguistic modality beyond the traditional orthographic transcriptions for cross-lingual transfer. Particularly, we propose unsupervised alignment objectives to capture (1) local one-to-one alignment between the two different modalities, (2) alignment via multi-modality contexts to leverage information from additional modalities, and (3) alignment via multilingual contexts where additional bilingual dictionaries are incorporated. We also release the first phonemic-orthographic alignment dataset on two token-level tasks (Named Entity Recognition and Part-of-Speech Tagging) among the understudied but interconnected Chinese-Japanese-Korean-Vietnamese (CJKV) languages. Our pilot study reveals phonemic transcription provides essential information beyond the orthography to enhance cross-lingual transfer and bridge the gap among CJKV languages, leading to consistent improvements on cross-lingual token-level tasks over orthographic-based multilingual PLMs.

* 11 pages, 1 figure, 7 tables. To appear in Findings of ACL 2023

Dimension Independent Mixup for Hard Negative Sample in Collaborative Filtering

Jun 28, 2023

Xi Wu, Liangwei Yang, Jibing Gong, Chao Zhou, Tianyu Lin, Xiaolong Liu, Philip S. Yu

Abstract: Collaborative filtering (CF) is a widely employed technique that predicts user preferences based on past interactions. Negative sampling plays a vital role in training CF-based models with implicit feedback. In this paper, we propose a novel perspective based on the sampling area to revisit existing sampling methods. We point out that current sampling methods mainly focus on Point-wise or Line-wise sampling, lacking flexibility and leaving a significant portion of the hard sampling area unexplored. To address this limitation, we propose Dimension Independent Mixup for Hard Negative Sampling (DINS), which is the first Area-wise sampling method for training CF-based models. DINS comprises three modules: Hard Boundary Definition, Dimension Independent Mixup, and Multi-hop Pooling. Experiments with real-world datasets on both matrix factorization and graph-based models demonstrate that DINS outperforms other negative sampling methods, establishing its effectiveness and superiority. Our work contributes a new perspective, introduces Area-wise sampling, and presents DINS as a novel approach that achieves state-of-the-art performance for negative sampling. Our implementations are available in PyTorch.

Multi-task Item-attribute Graph Pre-training for Strict Cold-start Item Recommendation

Jun 26, 2023

Yuwei Cao, Liangwei Yang, Chen Wang, Zhiwei Liu, Hao Peng, Chenyu You, Philip S. Yu

Abstract: Recommendation systems suffer in the strict cold-start (SCS) scenario, where the user-item interactions are entirely unavailable. The ID-based approaches completely fail to work. Cold-start recommenders, on the other hand, leverage item contents to map the new items to the existing ones. However, the existing SCS recommenders explore item contents in coarse-grained manners that introduce noise or information loss. Moreover, informative data sources other than item contents, such as users' purchase sequences and review texts, are ignored. We explore the role of the fine-grained item attributes in bridging the gaps between the existing and the SCS items and pre-train a knowledgeable item-attribute graph for SCS item recommendation. Our proposed framework, ColdGPT, models item-attribute correlations into an item-attribute graph by extracting fine-grained attributes from item contents. ColdGPT then transfers knowledge into the item-attribute graph from various available data sources, i.e., item contents, historical purchase sequences, and review texts of the existing items, via multi-task learning. To facilitate the positive transfer, ColdGPT designs submodules according to the natural forms of the data sources and coordinates the multiple pre-training tasks via unified alignment-and-uniformity losses. Our pre-trained item-attribute graph acts as an implicit, extendable item embedding matrix, which enables the SCS item embeddings to be easily acquired by inserting these items and propagating their attributes' embeddings. We carefully process three public datasets, i.e., Yelp, Amazon-home, and Amazon-sports, to guarantee the SCS setting for evaluation. Extensive experiments show that ColdGPT consistently outperforms the existing SCS recommenders by large margins and even surpasses models that are pre-trained on 75-224 times more cross-domain data on two out of four datasets.

* This work has been accepted as a FULL paper in RecSys 2023

Privacy and Fairness in Federated Learning: on the Perspective of Trade-off

Jun 25, 2023

Huiqiang Chen, Tianqing Zhu, Tao Zhang, Wanlei Zhou, Philip S. Yu

Abstract: Federated learning (FL) has been a hot topic in recent years. Ever since it was introduced, researchers have endeavored to devise FL systems that protect privacy or ensure fair results, with most research focusing on one or the other. As two crucial ethical notions, the interactions between privacy and fairness are comparatively less studied. However, since privacy and fairness compete, considering each in isolation will inevitably come at the cost of the other. To provide a broad view of these two critical topics, we presented a detailed literature review of privacy and fairness issues, highlighting unique challenges posed by FL and solutions in federated settings. We further systematically surveyed different interactions between privacy and fairness, trying to reveal how privacy and fairness could affect each other and point out new research directions in fair and private FL.

Addressing the Rank Degeneration in Sequential Recommendation via Singular Spectrum Smoothing

Jun 21, 2023

Ziwei Fan, Zhiwei Liu, Hao Peng, Philip S. Yu

Abstract: Sequential recommendation (SR) investigates the dynamic user preferences modeling and generates the next-item prediction. The next item preference is typically generated by the affinity between the sequence and item representations. However, both sequence and item representations suffer from the rank degeneration issue due to the data sparsity problem. The rank degeneration issue significantly impairs the representations for SR. This motivates us to measure how severe the rank degeneration issue is and alleviate the sequence and item representation rank degeneration issues simultaneously for SR. In this work, we theoretically connect the sequence representation degeneration issue with the item rank degeneration, particularly for short sequences and cold items. We also identify the connection between the fast singular value decay phenomenon and the rank collapse issue in transformer sequence output and item embeddings. We propose the area under the singular value curve metric to evaluate the severity of the singular value decay phenomenon and use it as an indicator of rank degeneration. We further introduce a novel singular spectrum smoothing regularization to alleviate the rank degeneration on both sequence and item sides, which is the Singular sPectrum sMoothing for sequential Recommendation (SPMRec). We also establish a correlation between the ranks of sequence and item embeddings and the rank of the user-item preference prediction matrix, which can affect recommendation diversity. We conduct experiments on four benchmark datasets to demonstrate the superiority of SPMRec over the state-of-the-art recommendation methods, especially in short sequences. The experiments also demonstrate a strong connection between our proposed singular spectrum smoothing and recommendation diversity.

* 18 pages, regularizations on preserving embedding rank are surrogates of intra-list recommendation diversity (controllable diversity). The code is at https://github.com/zfan20/SPMRec

Inconsistent Matters: A Knowledge-guided Dual-consistency Network for Multi-modal Rumor Detection

Jun 19, 2023

Mengzhu Sun, Xi Zhang, Jianqiang Ma, Sihong Xie, Yazheng Liu, Philip S. Yu

Abstract: Rumor spreaders are increasingly utilizing multimedia content to attract the attention and trust of news consumers. Though quite a few rumor detection models have exploited the multi-modal data, they seldom consider the inconsistent semantics between images and texts, and rarely spot the inconsistency among the post contents and background knowledge. In addition, they commonly assume the completeness of multiple modalities and thus are incapable of handling missing modalities in real-life scenarios. Motivated by the intuition that rumors in social media are more likely to have inconsistent semantics, a novel Knowledge-guided Dual-consistency Network is proposed to detect rumors with multimedia contents. It uses two consistency detection subnetworks to capture the inconsistency at the cross-modal level and the content-knowledge level simultaneously. It also enables robust multi-modal representation learning under different missing visual modality conditions, using a special token to discriminate between posts with visual modality and posts without visual modality. Extensive experiments on three public real-world multimedia datasets demonstrate that our framework can outperform the state-of-the-art baselines under both complete and incomplete modality conditions. Our code is available at https://github.com/MengzSun/KDCN.

* IEEE Transactions on Knowledge and Data Engineering, 2023

Comprehensive evaluation of deep and graph learning on drug-drug interactions prediction

Jun 08, 2023

Xuan Lin, Lichang Dai, Yafang Zhou, Zu-Guo Yu, Wen Zhang, Jian-Yu Shi, Dong-Sheng Cao, Li Zeng, Haowen Chen, Bosheng Song(+2 more)

Abstract: Recent advances and achievements of artificial intelligence (AI) as well as deep and graph learning models have established their usefulness in biomedical applications, especially in drug-drug interactions (DDIs). DDIs refer to a change in the effect of one drug due to the presence of another drug in the human body, which plays an essential role in drug discovery and clinical research. DDIs prediction through traditional clinical trials and experiments is an expensive and time-consuming process. To correctly apply the advanced AI and deep learning, developers and users face various challenges such as the availability and encoding of data resources, and the design of computational methods. This review summarizes chemical structure based, network based, NLP based and hybrid methods, providing an updated and accessible guide to the broad research and development community with different domain knowledge. We introduce widely-used molecular representation and describe the theoretical frameworks of graph neural network models for representing molecular structures. We present the advantages and disadvantages of deep and graph learning methods by performing comparative experiments. We discuss the potential technical challenges and highlight future directions of deep and graph learning models for accelerating DDIs prediction.

* Accepted by Briefings in Bioinformatics

Machine Unlearning: A Survey

Jun 06, 2023

Heng Xu, Tianqing Zhu, Lefeng Zhang, Wanlei Zhou, Philip S. Yu

Abstract: Machine learning has attracted widespread attention and evolved into an enabling technology for a wide range of highly successful applications, such as intelligent computer vision, speech recognition, medical diagnosis, and more. Yet a special need has arisen where, due to privacy, usability, and/or the right to be forgotten, information about some specific samples needs to be removed from a model, called machine unlearning. This emerging technology has drawn significant interest from both academics and industry due to its innovation and practicality. At the same time, this ambitious problem has led to numerous research efforts aimed at confronting its challenges. To the best of our knowledge, no study has analyzed this complex topic or compared the feasibility of existing unlearning solutions in different kinds of scenarios. Accordingly, with this survey, we aim to capture the key concepts of unlearning techniques. The existing solutions are classified and summarized based on their characteristics within an up-to-date and comprehensive review of each category's advantages and limitations. The survey concludes by highlighting some of the outstanding issues with unlearning techniques, along with some feasible directions for new research opportunities.
