
"Recommendation": models, code, and papers

Hybrid Collaborative Filtering with Autoencoders

Jul 19, 2016
Florian Strub, Jeremie Mary, Romaric Gaudel

Collaborative Filtering aims at exploiting the feedback of users to provide personalised recommendations. Such algorithms look for latent variables in a large sparse matrix of ratings. They can be enhanced by adding side information to tackle the well-known cold-start problem. While Neural Networks have had tremendous success in image and speech recognition, they have received less attention in Collaborative Filtering. This is all the more surprising given that Neural Networks are able to discover latent variables in large and heterogeneous datasets. In this paper, we introduce a Collaborative Filtering Neural network architecture, dubbed CFN, which computes a non-linear Matrix Factorization from sparse rating inputs and side information. We show experimentally on the MovieLens and Douban datasets that CFN outperforms the state of the art and benefits from side information. We provide an implementation of the algorithm as a reusable plugin for Torch, a popular Neural Network framework.
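The paper ships its CFN implementation as a Torch plugin; as a rough, hypothetical numpy sketch of the core idea only — an autoencoder that reconstructs each user's sparse rating row, computing a non-linear matrix factorization with the loss taken over observed entries — one might write (all sizes, data, and learning-rate choices below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, n_hidden = 8, 6, 3

# Synthetic low-rank ratings with roughly half the entries missing.
true = rng.random((n_users, 2)) @ rng.random((2, n_items))
observed = rng.random((n_users, n_items)) < 0.5
R = np.where(observed, true, np.nan)

X = np.nan_to_num(R)            # unobserved entries fed to the network as 0

W1 = rng.normal(0.0, 0.1, (n_items, n_hidden))   # encoder weights
W2 = rng.normal(0.0, 0.1, (n_hidden, n_items))   # decoder weights

def masked_rmse():
    # Reconstruction error measured only on observed ratings.
    err = np.where(observed, np.tanh(X @ W1) @ W2 - X, 0.0)
    return np.sqrt((err ** 2).sum() / observed.sum())

rmse_init = masked_rmse()

lr = 0.5
for _ in range(2000):
    H = np.tanh(X @ W1)                        # encode each user's rating row
    err = np.where(observed, H @ W2 - X, 0.0)  # loss only on observed ratings
    gW2 = H.T @ err
    gH = (err @ W2.T) * (1.0 - H ** 2)         # backprop through tanh
    gW1 = X.T @ gH
    W1 -= lr * gW1 / n_users
    W2 -= lr * gW2 / n_users

rmse_final = masked_rmse()
```

Masking the loss to observed entries is what makes this a factorization of a sparse matrix rather than a plain autoencoder; side information would enter as extra input columns alongside the rating row.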


Learning Multi-modal Similarity

Aug 30, 2010
Brian McFee, Gert Lanckriet

In many applications involving multi-media data, the definition of similarity between items is integral to several key tasks, e.g., nearest-neighbor retrieval, classification, and recommendation. Data in such regimes typically exhibits multiple modalities, such as acoustic and visual content of video. Integrating such heterogeneous data to form a holistic similarity space is therefore a key challenge to be overcome in many real-world applications. We present a novel multiple kernel learning technique for integrating heterogeneous data into a single, unified similarity space. Our algorithm learns an optimal ensemble of kernel transformations which conform to measurements of human perceptual similarity, as expressed by relative comparisons. To cope with the ubiquitous problems of subjectivity and inconsistency in multimedia similarity, we develop graph-based techniques to filter similarity measurements, resulting in a simplified and robust training procedure.
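A minimal, hypothetical sketch of the flavour of this idea — not the authors' algorithm — is to learn nonnegative weights over per-modality kernel distances so that the combined distance respects relative comparisons of the form "i is more similar to j than to k" (the toy data and hinge-loss formulation below are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 12

# Two hypothetical modalities (say, acoustic and visual features).
acoustic = rng.normal(size=(n, 3))
visual = rng.normal(size=(n, 3))

def sq_dists(F):
    # Squared Euclidean distances induced by the linear kernel F @ F.T.
    sq = (F ** 2).sum(1)
    return sq[:, None] + sq[None, :] - 2.0 * F @ F.T

D = np.stack([sq_dists(acoustic), sq_dists(visual)])   # (2, n, n)

# Relative comparisons (i, j, k): "i is more similar to j than to k",
# generated here from the acoustic modality alone.
trips = [tuple(rng.choice(n, 3, replace=False)) for _ in range(60)]
trips = [(i, j, k) for i, j, k in trips if D[0, i, j] + 0.5 < D[0, i, k]]

def n_violated(wv):
    # Comparisons whose unit margin is violated by the combined distance.
    return sum((wv * (D[:, i, k] - D[:, i, j])).sum() < 1.0
               for i, j, k in trips)

# Projected subgradient descent on a hinge loss over the comparisons,
# keeping the modality weights nonnegative.
w, best = np.ones(2), np.ones(2)
lr = 0.02
for _ in range(500):
    g = np.zeros(2)
    for i, j, k in trips:
        if (w * (D[:, i, k] - D[:, i, j])).sum() < 1.0:
            g -= D[:, i, k] - D[:, i, j]
    w = np.maximum(w - lr * g / max(len(trips), 1), 0.0)
    if n_violated(w) < n_violated(best):
        best = w
```

The paper learns kernel transformations rather than scalar weights, and additionally filters inconsistent comparisons with graph-based techniques; the sketch keeps only the "conform to relative comparisons" constraint.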


Balancing Fairness and Accuracy in Sentiment Detection using Multiple Black Box Models

Apr 22, 2022
Abdulaziz A. Almuzaini, Vivek K. Singh

Sentiment detection is an important building block for multiple information retrieval tasks such as product recommendation, cyberbullying detection, and misinformation detection. Unsurprisingly, multiple commercial APIs, each with different levels of accuracy and fairness, are now available for sentiment detection. While combining inputs from multiple modalities or black-box models for increasing accuracy is commonly studied in multimedia computing literature, there has been little work on combining different modalities for increasing the fairness of the resulting decision. In this work, we audit multiple commercial sentiment detection APIs for gender bias in two-actor news headline settings and report on the level of bias observed. Next, we propose a "Flexible Fair Regression" approach, which ensures satisfactory accuracy and fairness by jointly learning from multiple black-box models. The results pave the way for fair yet accurate sentiment detectors for multiple applications.
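One way such an accuracy–fairness trade-off over black-box scores can be set up — a hypothetical toy sketch, not the paper's actual "Flexible Fair Regression" formulation — is to learn ensemble weights minimising squared error plus a penalty on the gap between group means of the combined score (all data and the closed-form objective below are assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
g = np.repeat([0, 1], n // 2)          # protected attribute (two groups)
y = rng.normal(size=n)                 # ground-truth sentiment score

# Two hypothetical black-box APIs: one accurate but biased upward for
# group 0, one noisier but group-blind.
s_biased = y + 0.6 * (g == 0)
s_fair = y + 0.5 * rng.normal(size=n)
S = np.column_stack([s_biased, s_fair])

def fit(lam):
    # Exact minimiser of  mean squared error + lam * disparity^2,
    # where disparity is the gap between group means of the ensemble score.
    d = S[g == 0].mean(0) - S[g == 1].mean(0)
    w = np.linalg.solve(S.T @ S / n + lam * np.outer(d, d), S.T @ y / n)
    return w, abs(d @ w)

w_acc, disp_acc = fit(0.0)      # accuracy-only ensemble
w_fair, disp_fair = fit(10.0)   # fairness-regularised ensemble
```

Raising the penalty weight shifts mass toward the unbiased model, shrinking disparity at some cost in squared error — the joint-learning trade-off the abstract describes.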


Eliciting Best Practices for Collaboration with Computational Notebooks

Feb 15, 2022
Luigi Quaranta, Fabio Calefato, Filippo Lanubile

Despite the widespread adoption of computational notebooks, little is known about best practices for their usage in collaborative contexts. In this paper, we fill this gap by eliciting a catalog of best practices for collaborative data science with computational notebooks. With this aim, we first look for best practices through a multivocal literature review. Then, we conduct interviews with professional data scientists to assess their awareness of these best practices. Finally, we assess the adoption of best practices through the analysis of 1,380 Jupyter notebooks retrieved from the Kaggle platform. Findings reveal that experts are mostly aware of the best practices and tend to adopt them in their daily work. Nonetheless, they do not consistently follow all the recommendations as, depending on the specific context, some are deemed infeasible or counterproductive due to the lack of proper tool support. As such, we envision the design of notebook solutions that free data scientists from having to prioritize exploration and rapid prototyping over writing quality code.

* Proc. ACM Hum.-Comput. Interact., Vol. 6, No. CSCW1, Article 87, April 2022 


Cybertrust: From Explainable to Actionable and Interpretable AI (AI2)

Jan 26, 2022
Stephanie Galaitsi, Benjamin D. Trump, Jeffrey M. Keisler, Igor Linkov, Alexander Kott

To benefit from AI advances, users and operators of AI systems must have reason to trust them. Trust arises from multiple interactions, where predictable and desirable behavior is reinforced over time. Providing the system's users with some understanding of AI operations can support predictability, but forcing AI to explain itself risks constraining AI capabilities to only those reconcilable with human cognition. We argue that AI systems should be designed with features that build trust by bringing decision-analytic perspectives and formal tools into AI. Instead of trying to achieve explainable AI, we should develop interpretable and actionable AI. Actionable and Interpretable AI (AI2) will incorporate explicit quantifications and visualizations of user confidence in AI recommendations. In doing so, it will allow examining and testing of AI system predictions to establish a basis for trust in the systems' decision making and ensure broad benefits from deploying and advancing their computational capabilities.


Distributed Machine Learning and the Semblance of Trust

Dec 21, 2021
Dmitrii Usynin, Alexander Ziller, Daniel Rueckert, Jonathan Passerat-Palmbach, Georgios Kaissis

The utilisation of large and diverse datasets for machine learning (ML) at scale is required to promote scientific insight into many meaningful problems. However, due to data governance regulations such as GDPR as well as ethical concerns, the aggregation of personal and sensitive data is problematic, which prompted the development of alternative strategies such as distributed ML (DML). Techniques such as Federated Learning (FL) allow the data owner to maintain data governance and perform model training locally without having to share their data. FL and related techniques are often described as privacy-preserving. We explain why this term is not appropriate and outline the risks associated with over-reliance on protocols that were not designed with formal definitions of privacy in mind. We further provide recommendations and examples on how such algorithms can be augmented to provide guarantees of governance, security, privacy and verifiability for a general ML audience without prior exposure to formal privacy techniques.

* Accepted at The Third AAAI Workshop on Privacy-Preserving Artificial Intelligence 


A low-cost wave-solar powered Unmanned Surface Vehicle

Dec 07, 2021
Moustafa Elkolali, Ahmed Al-Tawil, Lennard Much, Ryan Schrader, Olivier Masset, Marina Sayols, Andrew Jenkins, Sara Alonso, Alfredo Carella, Alex Alcocer

This paper presents a prototype of a low-cost Unmanned Surface Vehicle (USV), powered by wave and solar energy, that can be used to minimize the cost of ocean data collection. The current prototype is a compact USV, with a length of 1.2 m, that can be deployed and recovered by two persons. The design includes an electrically operated winch that can be used to retract and lower the underwater unit. Several elements of the design make use of additive manufacturing and inexpensive materials. The vehicle can be controlled using radio frequency (RF) and satellite communication, through a custom-developed web application. Both the surface and underwater units were optimized with regard to drag, lift, weight, and price, following recommendations from previous research and using advanced materials. The USV could be used for water-condition monitoring by measuring several parameters, such as dissolved oxygen, salinity, temperature, and pH.


Intelligent Decision Assistance Versus Automated Decision-Making: Enhancing Knowledge Work Through Explainable Artificial Intelligence

Sep 28, 2021
Max Schemmer, Niklas Kühl, Gerhard Satzger

While recent advances in AI-based automated decision-making have shown many benefits for businesses and society, they also come at a cost. It has long been known that a high level of automation of decisions can lead to various drawbacks, such as automation bias and deskilling. In particular, the deskilling of knowledge workers is a major issue, as they are the same people who should also train, challenge and evolve AI. To address this issue, we conceptualize a new class of decision support systems (DSS), namely Intelligent Decision Assistance (IDA), based on a literature review of two different research streams -- DSS and automation. IDA supports knowledge workers without influencing them through automated decision-making. Specifically, we propose to use techniques of Explainable AI (XAI) while withholding concrete AI recommendations. To test this conceptualization, we develop hypotheses on the impacts of IDA and provide first evidence for their validity based on empirical studies in the literature.

* Hawaii International Conference on System Sciences 2022 (HICSS-55) 


Non-Parametric Graph Learning for Bayesian Graph Neural Networks

Jun 23, 2020
Soumyasundar Pal, Saber Malekmohammadi, Florence Regol, Yingxue Zhang, Yishi Xu, Mark Coates

Graphs are ubiquitous in modelling relational structures. Recent endeavours in machine learning for graph-structured data have led to many architectures and learning algorithms. However, the graph used by these algorithms is often constructed based on inaccurate modelling assumptions and/or noisy data. As a result, it fails to represent the true relationships between nodes. A Bayesian framework which targets posterior inference of the graph by considering it as a random quantity can be beneficial. In this paper, we propose a novel non-parametric graph model for constructing the posterior distribution of graph adjacency matrices. The proposed model is flexible in the sense that it can effectively take into account the output of graph-based learning algorithms that target specific tasks. In addition, model inference scales well to large graphs. We demonstrate the advantages of this model in three different problem settings: node classification, link prediction and recommendation.
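The framing of the graph as a random quantity with a posterior can be illustrated with a far simpler, hypothetical model than the paper's non-parametric one: if each observed edge is a noisy Bernoulli measurement with a known flip rate, the edge-wise posterior is exact, and downstream quantities can be averaged over sampled adjacency matrices (all probabilities and sizes below are made up):

```python
import numpy as np

rng = np.random.default_rng(4)
n, prior, eps = 12, 0.2, 0.1    # prior edge probability, flip probability

# A "true" graph and a noisy observation of it (each pair flipped w.p. eps).
A_true = np.triu(rng.random((n, n)) < prior, 1).astype(float)
flips = np.triu(rng.random((n, n)) < eps, 1).astype(float)
A_obs = np.abs(A_true - flips)   # XOR: the observed adjacency

# Exact edge-wise posterior P(edge | observation) by Bayes' rule.
p1 = prior * (1 - eps) / (prior * (1 - eps) + (1 - prior) * eps)  # obs = 1
p0 = prior * eps / (prior * eps + (1 - prior) * (1 - eps))        # obs = 0
P = np.triu(np.where(A_obs == 1, p1, p0), 1)

# Monte Carlo over graphs: sample adjacency matrices from the posterior
# and summarise the distribution of a graph statistic (here, edge count).
counts = np.array([(rng.random((n, n)) < P).sum() for _ in range(500)])
```

In the paper, samples like these would feed a graph-based learner for node classification, link prediction, or recommendation, so predictions inherit the uncertainty in the graph itself.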


Identifying Statistical Bias in Dataset Replication

May 19, 2020
Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Jacob Steinhardt, Aleksander Madry

Dataset replication is a useful tool for assessing whether improvements in test accuracy on a specific benchmark correspond to improvements in models' ability to generalize reliably. In this work, we present unintuitive yet significant ways in which standard approaches to dataset replication introduce statistical bias, skewing the resulting observations. We study ImageNet-v2, a replication of the ImageNet dataset on which models exhibit a significant (11-14%) drop in accuracy, even after controlling for a standard human-in-the-loop measure of data quality. We show that after correcting for the identified statistical bias, only an estimated $3.6\% \pm 1.5\%$ of the original $11.7\% \pm 1.0\%$ accuracy drop remains unaccounted for. We conclude with concrete recommendations for recognizing and avoiding bias in dataset replication. Code for our study is publicly available.
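The kind of selection bias at issue can be seen in a small, hypothetical simulation (not the authors' pipeline or data): filtering candidate images by a *noisy* estimate of a per-image statistic, such as human selection frequency from a handful of annotators, systematically favours upward noise, so the kept set's observed statistic overstates its true one:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

# True per-image statistic (e.g. selection frequency) and a noisy
# estimate of it from 10 annotators.
true_q = rng.beta(2, 2, size=n)
observed_q = rng.binomial(10, true_q) / 10.0

# "Replicate" by keeping only images whose *estimated* statistic clears
# a threshold, as a human-in-the-loop replication pipeline might.
keep = observed_q >= 0.7
obs_mean = observed_q[keep].mean()
true_mean = true_q[keep].mean()

# Conditioning on a high noisy estimate selects for upward annotation
# noise, so obs_mean exceeds true_mean: a statistical bias that makes
# the replicated set look better-matched than it really is.
```

Correcting for this regression-to-the-mean effect is, in spirit, what the paper's bias-adjusted accuracy-drop estimate does for ImageNet-v2.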
