Shaoping Ma

Brain Topography Adaptive Network for Satisfaction Modeling in Interactive Information Access System

Aug 17, 2022
Ziyi Ye, Xiaohui Xie, Yiqun Liu, Zhihong Wang, Xuesong Chen, Min Zhang, Shaoping Ma

With the growth of information on the Web, most users rely heavily on information access systems (e.g., search engines and recommender systems) in their daily lives. In this process, modeling users' satisfaction plays an essential part in improving their experience with these systems. In this paper, we explore the benefits of using electroencephalography (EEG) signals for satisfaction modeling in interactive information access system design. Unlike existing EEG classification tasks, the emergence of satisfaction involves multiple brain functions, such as arousal, prototypicality, and appraisals, which are related to different brain topographical areas. Modeling user satisfaction therefore poses great challenges for existing solutions. To address these challenges, we propose BTA, a Brain Topography Adaptive network with a multi-centrality encoding module and a spatial attention mechanism module that capture cognitive connectivities at different spatial distances. We explore the effectiveness of BTA for satisfaction modeling in two popular information access scenarios, i.e., search and recommendation. Extensive experiments on two real-world datasets verify the effectiveness of introducing a brain topography adaptive strategy in satisfaction modeling. Furthermore, we conduct a search result re-ranking task and a video rating prediction task based on the satisfaction inferred from brain signals in the search and recommendation scenarios, respectively. Experimental results show that brain signals extracted with BTA help improve the performance of interactive information access systems significantly.

* Accepted by Multimedia 2022 (MM'22) as a full paper

Disentangled Modeling of Domain and Relevance for Adaptable Dense Retrieval

Aug 11, 2022
Jingtao Zhan, Qingyao Ai, Yiqun Liu, Jiaxin Mao, Xiaohui Xie, Min Zhang, Shaoping Ma

Recent advances in Dense Retrieval (DR) techniques have significantly improved the effectiveness of first-stage retrieval. Trained with large-scale supervised data, DR models can encode queries and documents into a low-dimensional dense space and conduct effective semantic matching. However, previous studies have shown that the effectiveness of DR models drops by a large margin when the trained models are applied to a target domain that differs from the domain of the labeled data. One possible reason is that the DR model has never seen the target corpus and thus might be incapable of mitigating the difference between the training and target domains. In practice, unfortunately, training a DR model for each target domain to avoid domain shift is often difficult, as it requires additional time, storage, and domain-specific data labeling, which are not always available. To address this problem, we propose a novel DR framework named Disentangled Dense Retrieval (DDR) to support effective and flexible domain adaptation for DR models. DDR consists of a Relevance Estimation Module (REM) for modeling domain-invariant matching patterns and several Domain Adaption Modules (DAMs) for modeling domain-specific features of multiple target corpora. By disentangling the REM and DAMs, DDR enables a flexible training paradigm in which the REM is trained with supervision once and the DAMs are trained with unsupervised data. Comprehensive experiments in different domains and languages show that DDR significantly improves ranking performance compared to strong DR baselines and substantially outperforms traditional retrieval methods in most scenarios.

* Preprint
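
The matching step that the abstract attributes to DR models in general (encode queries and documents as dense vectors, then score by similarity) can be sketched as follows. This is only an illustration of inner-product ranking, not the DDR framework itself: the vectors are hand-written stand-ins for encoder outputs, and the names `inner_product` and `rank_documents` are hypothetical.

```python
def inner_product(a, b):
    """Similarity score between two dense vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def rank_documents(query_vec, doc_vecs, k=3):
    """Return the ids of the k documents with the highest inner product."""
    scored = [(inner_product(query_vec, vec), doc_id)
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Assumed toy embeddings; a real system would obtain these from trained encoders.
doc_vecs = {
    "d1": [0.9, 0.1, 0.0],
    "d2": [0.1, 0.8, 0.1],
    "d3": [0.5, 0.5, 0.0],
}
query_vec = [1.0, 0.0, 0.0]
top = rank_documents(query_vec, doc_vecs, k=2)
```

In practice the exhaustive scan would usually be replaced by an approximate nearest-neighbor index, but the scoring rule stays the same.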

Towards Representation Alignment and Uniformity in Collaborative Filtering

Jun 26, 2022
Chenyang Wang, Yuanqing Yu, Weizhi Ma, Min Zhang, Chong Chen, Yiqun Liu, Shaoping Ma

Collaborative filtering (CF) plays a critical role in the development of recommender systems. Most CF methods utilize an encoder to embed users and items into the same representation space, and the Bayesian personalized ranking (BPR) loss is usually adopted as the objective function to learn informative encoders. Existing studies mainly focus on designing more powerful encoders (e.g., graph neural networks) to learn better representations. However, few efforts have been devoted to investigating the desired properties of representations in CF, which is important for understanding the rationale of existing CF methods and designing new learning objectives. In this paper, we measure the representation quality in CF from the perspective of alignment and uniformity on the hypersphere. We first theoretically reveal the connection between the BPR loss and these two properties. Then, we empirically analyze the learning dynamics of typical CF methods in terms of quantified alignment and uniformity, which shows that better alignment and better uniformity both contribute to higher recommendation performance. Based on the analysis results, we propose a learning objective that directly optimizes these two properties, named DirectAU. We conduct extensive experiments on three public datasets, and the proposed learning framework with a simple matrix factorization model leads to significant performance improvements compared to state-of-the-art CF methods. Our implementations are publicly available at https://github.com/THUwangcy/DirectAU.

* Accepted by KDD'2022
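
The abstract does not spell out the formulas, so the sketch below uses the commonly cited definitions of alignment and uniformity on the unit hypersphere: alignment as the mean squared distance between normalized embeddings of positive user-item pairs, and uniformity as the log of the mean Gaussian potential over pairs of embeddings. The temperature `t`, the weight `gamma`, and all function names are assumptions made for illustration; the linked repository is the authoritative implementation.

```python
import math


def l2_normalize(vec):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(user_vecs, item_vecs):
    """Mean squared distance between positive (user, item) pairs, row by row."""
    pairs = list(zip(user_vecs, item_vecs))
    return sum(sq_dist(l2_normalize(u), l2_normalize(i))
               for u, i in pairs) / len(pairs)


def uniformity(vecs, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs (needs >= 2 vectors)."""
    normed = [l2_normalize(v) for v in vecs]
    potentials = [math.exp(-t * sq_dist(normed[a], normed[b]))
                  for a in range(len(normed))
                  for b in range(a + 1, len(normed))]
    return math.log(sum(potentials) / len(potentials))


def align_uniform_loss(user_vecs, item_vecs, gamma=1.0):
    """Assumed combination: alignment plus gamma times the averaged uniformity."""
    uni = (uniformity(user_vecs) + uniformity(item_vecs)) / 2
    return alignment(user_vecs, item_vecs) + gamma * uni
```

Lower values are better for both terms: alignment pulls interacting users and items together, while uniformity spreads embeddings over the sphere so they do not collapse to one point.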

A Survey on the Fairness of Recommender Systems

Jun 19, 2022
Yifan Wang, Weizhi Ma, Min Zhang, Yiqun Liu, Shaoping Ma

Recommender systems are an essential tool for relieving information overload and play an important role in people's daily lives. Since recommendations involve allocations of social resources (e.g., job recommendation), an important issue is whether recommendations are fair. Unfair recommendations are not only unethical but also harm the long-term interests of the recommender system itself. As a result, fairness issues in recommender systems have recently attracted increasing attention. However, due to multiple complex resource allocation processes and various fairness definitions, the research on fairness in recommendation is scattered. To fill this gap, we review over 60 papers published in top conferences/journals, including TOIS, SIGIR, and WWW. First, we summarize fairness definitions in recommendation and provide several views to classify fairness issues. Then, we review recommendation datasets and measurements in fairness studies and provide an elaborate taxonomy of fairness methods in recommendation. Finally, we conclude this survey by outlining some promising future directions.

* Submitted to the Special Section on Trustworthy Recommendation and Search of ACM TOIS on March 27, 2022 and accepted on June 6

Evaluating Extrapolation Performance of Dense Retrieval

Apr 25, 2022
Jingtao Zhan, Xiaohui Xie, Jiaxin Mao, Yiqun Liu, Min Zhang, Shaoping Ma

A retrieval model should not only interpolate the training data but also extrapolate well to queries that differ substantially from the training data. While dense retrieval (DR) models have been demonstrated to achieve better retrieval performance than traditional term-based retrieval models, we still know little about whether they can extrapolate. To shed light on this research question, we investigate how DR models perform in both the interpolation and extrapolation regimes. We first investigate the distribution of training and test data on popular retrieval benchmarks and identify a considerable overlap in query entities, query intent, and relevance labels. This finding implies that the performance on these test sets is biased towards interpolation and cannot accurately reflect the extrapolation capacity. Therefore, to evaluate the extrapolation performance of DR models, we propose two resampling strategies for existing retrieval benchmarks and comprehensively investigate how DR models perform. Results show that DR models may interpolate as well as complex interaction-based models (e.g., BERT and ColBERT) but extrapolate substantially worse. Among various DR training strategies, text-encoding pretraining and target-domain pretraining are particularly effective for improving the extrapolation capacity. Finally, we compare the extrapolation capacity with domain transfer ability. Despite its simplicity and ease of use, the extrapolation performance can reflect the domain transfer ability in some domains of the BEIR dataset, further highlighting the feasibility of our approaches in evaluating the generalizability of DR models.
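
One way to picture the interpolation/extrapolation distinction is to partition test queries by whether they share any entity with the training queries. The sketch below does only that. It is not one of the two resampling strategies proposed in the paper (the abstract does not describe them), and it assumes that entity annotations per query are already available; `split_by_entity_overlap` and the toy query ids are hypothetical.

```python
def split_by_entity_overlap(train_queries, test_queries):
    """Partition test query ids by entity overlap with the training set.

    Both arguments map a query id to an iterable of entity strings
    (an assumed annotation format).
    """
    train_entities = set()
    for entities in train_queries.values():
        train_entities.update(entities)

    interpolation, extrapolation = [], []
    for qid, entities in test_queries.items():
        if train_entities & set(entities):
            interpolation.append(qid)   # shares at least one training entity
        else:
            extrapolation.append(qid)   # no entity seen during training
    return interpolation, extrapolation


train = {"q1": ["python", "pandas"], "q2": ["tokyo"]}
test = {"t1": ["pandas"], "t2": ["kyoto"], "t3": ["tokyo", "osaka"]}
interp_ids, extrap_ids = split_by_entity_overlap(train, test)
```

Reporting a metric separately on the two groups would then show how much of a model's test score depends on overlap with the training data.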

ConvSearch: A Open-Domain Conversational Search Behavior Dataset

Apr 06, 2022
Zhumin Chu, Zhihong Wang, Yiqun Liu, Yingye Huang, Min Zhang, Shaoping Ma

Conversational search has received much attention recently with the increasing popularity of intelligent user interfaces. However, compared with the effort devoted to designing effective conversational search algorithms, far fewer researchers have focused on the construction of benchmark datasets. For most existing datasets, the information needs are defined by researchers and search requests are not proposed by actual users. Meanwhile, these datasets usually focus on the conversations between users and agents (systems), while largely ignoring the search behaviors of agents before they return responses to users. To overcome these problems, we construct a Chinese Open-Domain Conversational Search Behavior Dataset (ConvSearch) based on the Wizard-of-Oz paradigm in a field study scenario. We develop a novel conversational search platform to collect dialogue contents, annotate dialogue quality and candidate search results, and record agent search behaviors. 25 search agents and 51 users are recruited for the field study, which lasts about 45 days. The ConvSearch dataset contains 1,131 dialogues together with annotated search results and corresponding search behaviors. We also provide the intent labels of each search behavior iteration to support research on intent understanding. The dataset is already open to the public for academic usage.

* 10 pages

A Survey on Dropout Methods and Experimental Verification in Recommendation

Apr 05, 2022
Yangkun Li, Weizhi Ma, Chong Chen, Min Zhang, Yiqun Liu, Shaoping Ma, Yuekui Yang

Overfitting is a common problem in machine learning, in which the model fits the training data too closely while performing poorly on the test data. Among various methods of coping with overfitting, dropout is one of the most representative. From randomly dropping neurons to dropping neural structures, dropout has achieved great success in improving model performance. Although various dropout methods have been designed and widely applied in past years, their effectiveness, application scenarios, and contributions have not been comprehensively summarized and empirically compared so far. It is the right time for a comprehensive survey. In this paper, we systematically review previous dropout methods and classify them into three major categories according to the stage where the dropout operation is performed. Specifically, more than seventy dropout methods published in top AI conferences or journals (e.g., TKDE, KDD, TheWebConf, SIGIR) are covered. The designed taxonomy is easy to understand and capable of including new dropout methods. Then, we further discuss their application scenarios, connections, and contributions. To verify the effectiveness of distinct dropout methods, extensive experiments are conducted on recommendation scenarios with abundant heterogeneous information. Finally, we propose some open problems and potential research directions about dropout that are worth further exploration.

* 26 pages
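
As a reference point for the simplest case mentioned in the abstract (randomly dropping neurons), here is a minimal sketch of standard inverted dropout on a list of activations. It does not represent any particular method from the survey's taxonomy; the function name and the optional `rng` parameter are choices made for this illustration.

```python
import random


def dropout(activations, p=0.5, training=True, rng=None):
    """Inverted dropout: zero each unit with probability p during training
    and scale the survivors by 1 / (1 - p) so the expected value is unchanged."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)  # identity at inference time
    rng = rng or random.Random()
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# A seeded generator makes the random mask reproducible.
masked = dropout([1.0] * 1000, p=0.5, rng=random.Random(0))
```

Because of the rescaling during training, no correction is needed at inference time; calling the function with `training=False` returns the activations unchanged.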

Interpreting Dense Retrieval as Mixture of Topics

Nov 27, 2021
Jingtao Zhan, Jiaxin Mao, Yiqun Liu, Jiafeng Guo, Min Zhang, Shaoping Ma

Dense Retrieval (DR) reaches state-of-the-art results in first-stage retrieval, but little is known about the mechanisms that contribute to its success. Therefore, in this work, we conduct an interpretation study of recently proposed DR models. Specifically, we first discretize the embeddings output by the document and query encoders. Based on the discrete representations, we analyze the attribution of input tokens. Both qualitative and quantitative experiments are carried out on public test collections. Results suggest that DR models pay attention to different aspects of input and extract various high-level topic representations. Therefore, we can regard the representations learned by DR models as a mixture of high-level topics.

Web Search via an Efficient and Effective Brain-Machine Interface

Oct 15, 2021
Xuesong Chen, Ziyi Ye, Xiaohui Xie, Yiqun Liu, Weihang Su, Shuqi Zhu, Min Zhang, Shaoping Ma

While search technologies have evolved to be robust and ubiquitous, the fundamental interaction paradigm has remained relatively stable for decades. With the maturity of Brain-Machine Interface technology, we build an efficient and effective communication system between human beings and search engines based on electroencephalogram (EEG) signals, called the Brain-Machine Search Interface (BMSI) system. The BMSI system provides functions including query reformulation and search result interaction. In our system, users can perform search tasks without having to use the mouse and keyboard. Therefore, it is useful for application scenarios in which hand-based interactions are infeasible, e.g., for users with severe neuromuscular disorders. In addition, by decoding brain signals, our system can provide the search engine with abundant and valuable user-side context information (e.g., real-time satisfaction feedback, extensive context information, and a clearer description of information needs) that is hard to capture in the previous paradigm. In our implementation, the system can decode user satisfaction from brain signals in real time during the interaction process and re-rank the search result list based on user satisfaction feedback. The demo video is available at http://www.thuir.cn/group/YQLiu/datasets/BMSISystem.mp4.
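
The abstract does not say how satisfaction feedback enters the ranking, so the following is only one plausible sketch: blend each result's original retrieval score with a decoded satisfaction estimate in [0, 1] and sort by the blend. The linear combination, the `weight` parameter, and the neutral default of 0.5 for results without feedback are all assumptions, not the BMSI system's actual method.

```python
def rerank(results, satisfaction, weight=0.5):
    """Re-order (doc_id, score) pairs using per-document satisfaction estimates.

    `satisfaction` maps doc_id to a value in [0, 1]; documents without
    feedback fall back to a neutral 0.5 (an assumed default).
    """
    def blended(item):
        doc_id, score = item
        return (1 - weight) * score + weight * satisfaction.get(doc_id, 0.5)

    return [doc_id for doc_id, _ in sorted(results, key=blended, reverse=True)]


results = [("a", 0.9), ("b", 0.8), ("c", 0.7)]
satisfaction = {"a": 0.1, "c": 0.9}   # hypothetical decoded feedback
reranked = rerank(results, satisfaction)
```

With these toy numbers, the result with the highest decoded satisfaction moves to the top even though its original score was the lowest.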
