Out-of-distribution data and anomalous inputs are vulnerabilities of machine learning systems today, often causing systems to make incorrect predictions. The diverse range of data on which these models are used makes detecting atypical inputs a difficult and important task. We assess a tool, Benford's law, as a method used to quantify the difference between real and corrupted inputs. We believe that in many settings, it could function as a filter for anomalous data points and for signalling out-of-distribution data. We hope to open a discussion on these applications and further areas where this technique is underexplored.
Collaborative filtering (CF) has achieved great success in the field of recommender systems. In recent years, many novel CF models, particularly those based on deep learning or graph techniques, have been proposed for a variety of recommendation tasks, such as rating prediction and item ranking. These newly published models usually demonstrate their performance in comparison to baselines or existing models in terms of accuracy improvements. However, others have pointed out that many newly proposed models are not as strong as expected and are outperformed by very simple baselines. This paper proposes a simple linear model based on Matrix Factorization (MF), called UserReg, which regularizes users' latent representations with explicit feedback information for rating prediction. We compare the effectiveness of UserReg with three linear CF models that are widely-used as baselines, and with a set of recently proposed complex models that are based on deep learning or graph techniques. Experimental results show that UserReg achieves overall better performance than the fine-tuned baselines considered and is highly competitive when compared with other recently proposed models. We conclude that UserReg can be used as a strong baseline for future CF research.