Mark Sanderson

i-Align: an interpretable knowledge graph alignment model

Aug 26, 2023
Bayu Distiawan Trisedya, Flora D Salim, Jeffrey Chan, Damiano Spina, Falk Scholer, Mark Sanderson

Knowledge graphs (KGs) are becoming essential resources for many downstream applications. However, their incompleteness may limit their potential. Thus, continuous curation is needed to mitigate this problem. One strategy to address this problem is KG alignment, i.e., forming a more complete KG by merging two or more KGs. This paper proposes i-Align, an interpretable KG alignment model. Unlike existing KG alignment models, i-Align provides an explanation for each alignment prediction while maintaining high alignment performance. Experts can use the explanation to check the correctness of the alignment prediction, so the high quality of a KG can be maintained during the curation process (e.g., the merging of two KGs). To this end, a novel Transformer-based Graph Encoder (Trans-GE) is proposed as a key component of i-Align for aggregating information from entities' neighbors (structures). Trans-GE uses Edge-gated Attention, which combines the adjacency matrix and the self-attention matrix to learn a gating mechanism that controls information aggregation from neighboring entities. It also uses historical embeddings, allowing Trans-GE to be trained over mini-batches, or smaller sub-graphs, to address the scalability issue when encoding a large KG. Another component of i-Align is a Transformer encoder for aggregating entities' attributes. This way, i-Align can generate explanations in the form of a set of the most influential attributes/neighbors based on attention weights. Extensive experiments are conducted to evaluate i-Align across several aspects, including its effectiveness for aligning KGs, the quality of the generated explanations, and its practicality for aligning large KGs. The results show the effectiveness of i-Align in these aspects.
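
As a rough illustration of the Edge-gated Attention idea described above, the sketch below combines a standard self-attention matrix with a KG adjacency matrix through a learned gate; the class name, single-head formulation, and gating form are assumptions for illustration, not the authors' exact Trans-GE implementation.

```python
# A minimal, hypothetical sketch of edge-gated attention (PyTorch). The exact
# gating used by Trans-GE may differ; this only shows how an adjacency matrix
# and a self-attention matrix can jointly control neighbor aggregation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeGatedAttention(nn.Module):  # illustrative name, single head
    def __init__(self, dim: int):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)
        self.gate = nn.Linear(2, 1)  # mixes attention weight and edge indicator

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (n_entities, dim) entity embeddings
        # adj: (n_entities, n_entities) float adjacency (1.0 where an edge exists)
        scores = self.q(x) @ self.k(x).T / x.shape[-1] ** 0.5
        attn = F.softmax(scores, dim=-1)
        # Learned gate decides how much each neighbor contributes, based on both
        # the self-attention weight and whether a KG edge connects the pair.
        g = torch.sigmoid(self.gate(torch.stack([attn, adj], dim=-1))).squeeze(-1)
        weights = F.normalize(attn * g, p=1.0, dim=-1)  # re-normalize each row
        return weights @ self.v(x)

# Example: 5 entities with 64-dim embeddings and self-loops only.
out = EdgeGatedAttention(64)(torch.randn(5, 64), torch.eye(5))
```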

* Data Min Knowl Disc (2023) 

Designing and Evaluating Presentation Strategies for Fact-Checked Content

Aug 20, 2023
Danula Hettiachchi, Kaixin Ji, Jenny Kennedy, Anthony McCosker, Flora Dylis Salim, Mark Sanderson, Falk Scholer, Damiano Spina

With the rapid growth of online misinformation, it is crucial to have reliable fact-checking methods. Recent research on finding check-worthy claims and on automated fact-checking has made significant advances. However, limited guidance exists regarding the presentation of fact-checked content to effectively convey verified information to users. We address this research gap by exploring the critical design elements in fact-checking reports and investigating whether credibility- and presentation-based design improvements can enhance users' ability to interpret the report accurately. We co-developed potential content presentation strategies through a workshop involving fact-checking professionals, communication experts, and researchers. The workshop examined the significance and utility of elements such as veracity indicators and explored the feasibility of incorporating interactive components for enhanced information disclosure. Building on the workshop outcomes, we conducted an online experiment involving 76 crowd workers to assess the efficacy of different design strategies. The results indicate that the proposed strategies significantly improve users' ability to accurately interpret the verdict of fact-checking articles. Our findings underscore the critical role of effectively presenting fact-checking reports in addressing the spread of misinformation. By adopting appropriate design enhancements, the effectiveness of such reports can be maximized, enabling users to make informed judgments.

* Accepted to the 32nd ACM International Conference on Information and Knowledge Management (CIKM '23) 

More Is Less: When Do Recommenders Underperform for Data-rich Users?

Apr 15, 2023
Yueqing Xuan, Kacper Sokol, Jeffrey Chan, Mark Sanderson

Users of recommender systems tend to differ in their level of interaction with these algorithms, which may affect the quality of recommendations they receive and lead to undesirable performance disparity. In this paper we investigate under what conditions the performance for data-rich and data-poor users diverges for a collection of popular evaluation metrics applied to ten benchmark datasets. We find that Precision is consistently higher for data-rich users across all the datasets; Mean Average Precision is comparable across user groups but its variance is large; and Recall yields a counter-intuitive result in which the algorithm performs better for data-poor than for data-rich users, a bias that is further exacerbated when negative item sampling is employed during evaluation. The final observation suggests that as users interact more with recommender systems, the quality of recommendations they receive degrades (when measured by Recall). Our insights clearly show the importance of the choice of evaluation protocol and its influence on the reported results when studying recommender systems.
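
As a hedged sketch of the per-group comparison this abstract describes, the snippet below splits users into data-rich and data-poor groups by a hypothetical interaction threshold and averages Precision@k and Recall@k within each group; the function names, threshold, and data layout are assumptions, not the paper's evaluation protocol.

```python
# Illustrative only: compares top-k Precision/Recall between data-rich and
# data-poor user groups; the interaction threshold and dict layout are assumed.
from typing import Dict, List, Set, Tuple

def precision_recall_at_k(recommended: List[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    hits = len(set(recommended[:k]) & relevant)
    return hits / k, hits / max(len(relevant), 1)

def compare_groups(n_interactions: Dict[str, int],
                   recs: Dict[str, List[str]],
                   truth: Dict[str, Set[str]],
                   k: int = 10, rich_threshold: int = 100) -> Dict[str, Tuple[float, float]]:
    groups: Dict[str, List[Tuple[float, float]]] = {"data_rich": [], "data_poor": []}
    for user, recommended in recs.items():
        group = "data_rich" if n_interactions[user] >= rich_threshold else "data_poor"
        groups[group].append(precision_recall_at_k(recommended, truth[user], k))
    # Average (precision, recall) per group; empty groups are skipped.
    return {g: (sum(p for p, _ in v) / len(v), sum(r for _, r in v) / len(v))
            for g, v in groups.items() if v}
```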

MIMICS-Duo: Offline & Online Evaluation of Search Clarification

Jun 09, 2022
Leila Tavakoli, Johanne R. Trippas, Hamed Zamani, Falk Scholer, Mark Sanderson

Asking clarification questions is an active area of research; however, resources for training and evaluating search clarification methods are not sufficient. To address this issue, we describe MIMICS-Duo, a new freely available dataset of 306 search queries with multiple clarifications (a total of 1,034 query-clarification pairs). MIMICS-Duo contains fine-grained annotations on clarification questions and their candidate answers and enhances the existing MIMICS datasets by enabling multi-dimensional evaluation of search clarification methods, including online and offline evaluation. We conduct extensive analysis to demonstrate the relationship between offline and online search clarification datasets and outline several research directions enabled by MIMICS-Duo. We believe that this resource will help researchers better understand clarification in search.
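
A minimal sketch of how one might load and group a query-clarification table such as MIMICS-Duo; the file layout and column names below ("query", "clarification") are assumptions and should be checked against the released dataset.

```python
# Hypothetical schema: a tab-separated file with "query" and "clarification"
# columns; adjust the column names to match the actual MIMICS-Duo release.
import pandas as pd

def clarifications_per_query(path: str) -> pd.Series:
    pairs = pd.read_csv(path, sep="\t")
    # Count distinct candidate clarifications per search query
    # (the paper reports 1,034 pairs over 306 queries).
    return pairs.groupby("query")["clarification"].nunique().sort_values(ascending=False)
```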

* 11 pages 

Joint Modelling of Cyber Activities and Physical Context to Improve Prediction of Visitor Behaviors

Aug 26, 2020
Manpreet Kaur, Flora D. Salim, Yongli Ren, Jeffrey Chan, Martin Tomko, Mark Sanderson

This paper investigates the cyber-physical behavior of users in a large indoor shopping mall by leveraging anonymized (opt-in) Wi-Fi association and browsing logs recorded by the mall operators. Our analysis shows that many users exhibit a high correlation between their cyber activities and their physical context. To find this correlation, we propose a mechanism to semantically label a physical space with rich categorical information from DBPedia concepts and to compute a contextual similarity that relates a user's activities to the mall context. We demonstrate the application of cyber-physical contextual similarity in two situations: user visit intent classification and future location prediction. The experimental results demonstrate that exploiting contextual similarity significantly improves the accuracy of such applications.
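
As a hedged sketch of one way such a cyber-physical contextual similarity could be computed, the snippet below takes the cosine similarity between a user's browsed-category counts and the DBPedia category labels of a visited space; the bag-of-categories representation is an assumption for illustration, not the paper's exact measure.

```python
# Illustrative cosine similarity between a user's cyber-activity categories and
# the DBPedia-derived categories of a physical space; the paper's actual
# contextual similarity may be defined differently.
from collections import Counter
from math import sqrt

def contextual_similarity(user_categories: Counter, place_categories: Counter) -> float:
    shared = set(user_categories) & set(place_categories)
    dot = sum(user_categories[c] * place_categories[c] for c in shared)
    norm_u = sqrt(sum(v * v for v in user_categories.values()))
    norm_p = sqrt(sum(v * v for v in place_categories.values()))
    return dot / (norm_u * norm_p) if norm_u and norm_p else 0.0

# Example: categories from browsing logs vs. DBPedia labels of a food court.
print(contextual_similarity(Counter({"Restaurant": 3, "Coffee": 1}),
                            Counter({"Restaurant": 1, "Food_court": 2})))
```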

* Accepted in ACM Transactions on Sensor Networks, 2020 

Conversational Search -- A Report from Dagstuhl Seminar 19461

May 18, 2020
Avishek Anand, Lawrence Cavedon, Matthias Hagen, Hideo Joho, Mark Sanderson, Benno Stein

Dagstuhl Seminar 19461 "Conversational Search" was held on 10-15 November 2019. 44 researchers in Information Retrieval and Web Search, Natural Language Processing, Human Computer Interaction, and Dialogue Systems were invited to share the latest developments in the area of Conversational Search and discuss its research agenda and future directions. The 5-day program of the seminar consisted of six introductory and background sessions, three visionary talk sessions, one industry talk session, and seven working group and reporting sessions. The seminar also had three social events during the program. This report provides the executive summary, an overview of the invited talks, and findings from the seven working groups, which cover the definition, evaluation, modelling, explanation, scenarios, applications, and prototype of Conversational Search. The ideas and findings presented in this report should serve as one of the main sources for diverse research programs on Conversational Search.

* contains arXiv:2001.06910, arXiv:2001.02912 