"Recommendation": models, code, and papers

"Are you sure?": Preliminary Insights from Scaling Product Comparisons to Multiple Shops

Jul 08, 2021
Patrick John Chia, Bingqing Yu, Jacopo Tagliabue

Large eCommerce players introduced comparison tables as a new type of recommendations. However, building comparisons at scale without pre-existing training/taxonomy data remains an open challenge, especially within the operational constraints of shops in the long tail. We present preliminary results from building a comparison pipeline designed to scale in a multi-shop scenario: we describe our design choices and run extensive benchmarks on multiple shops to stress-test it. Finally, we run a small user study on property selection and conclude by discussing potential improvements and highlighting the questions that remain to be addressed.

* Accepted for publication at SIGIR eCom 2021 


Analysis of draft EU ADS Performance Requirements

Jun 03, 2021
Maria Soledad Elli, Jack Weast

Recently, the European Commission published draft regulation for uniform procedures and technical specification for the type-approval of motor vehicles with an automated driving system (ADS). While the draft regulation is welcome progress for an industry ready to deploy life saving automated vehicle technology, we believe that the requirements can be further improved to enhance the safety and societal acceptance of automated vehicles (AVs). In this paper, we evaluate the draft regulation's performance requirements that would impact the Dynamic Driving Task (DDT). We highlight potential problems that can arise from the current proposed requirements and propose practical recommendations to improve the regulation.


Pitfalls in Machine Learning Research: Reexamining the Development Cycle

Nov 04, 2020
Stella Biderman, Walter J. Scheirer

Machine learning has the potential to fuel further advances in data science, but it is greatly hindered by an ad hoc design process, poor data hygiene, and a lack of statistical rigor in model evaluation. Recently, these issues have begun to attract more attention as they have caused public and embarrassing issues in research and development. Drawing from our experience as machine learning researchers, we follow the machine learning process from algorithm design to data collection to model evaluation, drawing attention to common pitfalls and providing practical recommendations for improvements. At each step, case studies are introduced to highlight how these pitfalls occur in practice, and where things could be improved.

* NeurIPS "I Can't Believe It's Not Better!" Workshop 


The GeoLifeCLEF 2020 Dataset

Apr 08, 2020
Elijah Cole, Benjamin Deneu, Titouan Lorieul, Maximilien Servajean, Christophe Botella, Dan Morris, Nebojsa Jojic, Pierre Bonnet, Alexis Joly

Understanding the geographic distribution of species is a key concern in conservation. By pairing species occurrences with environmental features, researchers can model the relationship between an environment and the species which may be found there. To facilitate research in this area, we present the GeoLifeCLEF 2020 dataset, which consists of 1.9 million species observations paired with high-resolution remote sensing imagery, land cover data, and altitude, in addition to traditional low-resolution climate and soil variables. We also discuss the GeoLifeCLEF 2020 competition, which aims to use this dataset to advance the state-of-the-art in location-based species recommendation.

* 10 pages, 4 figures 


Learning from Binary Multiway Data: Probabilistic Tensor Decomposition and its Statistical Optimality

Nov 13, 2018
Miaoyan Wang, Lexin Li

We consider the problem of decomposing multiway tensors with binary entries. Such data problems arise frequently in numerous applications such as neuroimaging, recommendation systems, topic modeling, and sensor network localization. We propose that the observed binary entries follow a Bernoulli model, develop a rank-constrained likelihood-based estimation procedure, and obtain theoretical accuracy guarantees. Specifically, we establish the error bound of the tensor estimation, and show that the obtained rate is minimax optimal under the considered model. We demonstrate the efficacy of our approach through both simulations and analyses of multiple real-world datasets on the tasks of tensor completion and clustering.

* 22 pages, 5 figures 


Quantum-inspired classical algorithms for principal component analysis and supervised clustering

Oct 31, 2018
Ewin Tang

We describe classical analogues to quantum algorithms for principal component analysis and nearest-centroid clustering. Given sampling assumptions, our classical algorithms run in time polylogarithmic in input, matching the runtime of the quantum algorithms with only polynomial slowdown. These algorithms are evidence that their corresponding problems do not yield exponential quantum speedups. To build our classical algorithms, we use the same techniques as applied in our previous work dequantizing a quantum recommendation systems algorithm. Thus, we provide further evidence for the strength of classical $\ell^2$-norm sampling assumptions when replacing quantum state preparation assumptions, in the machine learning domain.

* 5 pages 


A new subband non linear prediction coding algorithm for narrowband speech signal: The nADPCMB MLT coding scheme

Mar 24, 2022
Guido D'Alessandro, Marcos Faundez Zanuy, Francesco Piazza

This paper focuses on a newly developed transparent nADPCMB MLT speech coding algorithm. Our coder first decomposes the narrowband speech signal into subbands; a non linear ADPCM scheme is then performed in each subband. The signal subband decomposition is piloted by the equivalent Modulated Lapped Transform (MLT) filter bank. The novelty of this algorithm is the non linear approach, based on neural networks, to subband prediction coding. We have evaluated the performance of the nADPCMB MLT coding algorithm with a session of formal listening based on the five grade impairment scale standardized within ITU-T Recommendation P.800.

* 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2002, pp. I-1025-I-1028 
* 4 pages, published in 2002 IEEE International Conference on Acoustics, Speech, and Signal Processing Orlando, FL, USA 


Forget me not: A Gentle Reminder to Mind the Simple Multi-Layer Perceptron Baseline for Text Classification

Sep 23, 2021
Lukas Galke, Ansgar Scherp

Graph neural networks have triggered a resurgence of graph-based text classification. We show that even a simple MLP baseline achieves comparable performance on benchmark datasets, questioning the importance of synthetic graph structures. When considering an inductive scenario, i.e., when adding new documents to a corpus, a simple MLP even outperforms the recent graph-based models TextGCN and HeteGCN and is comparable with HyperGAT. We further fine-tune DistilBERT and find that it outperforms all state-of-the-art models. We suggest that future studies use at least an MLP baseline to contextualize the results. We provide recommendations for the design and training of such a baseline.

* 5 pages, added link to code 


RaspberryPI for mosquito neutralization by power laser

May 20, 2021
R. Ildar

In this article, comprehensive studies of mosquito neutralization using machine vision and a 1 W power laser are considered for the first time. We developed a laser installation with a Raspberry Pi that changes the direction of the laser with a galvanometer. We developed a program for mosquito tracking in real time. The possibility of using deep neural networks, Haar cascades, and machine learning for mosquito recognition was considered. We considered in detail the classification problems of mosquitoes in images. A recommendation is given for the implementation of this device based on a microcontroller for subsequent use as part of an unmanned aerial vehicle. Any harmful insects in the fields can serve as objects for control.


Relational Boosted Bandits

Dec 16, 2020
Ashutosh Kakadiya, Sriraam Natarajan, Balaraman Ravindran

Contextual bandits algorithms have become essential in real-world user interaction problems in recent years. However, these algorithms rely on context as an attribute-value representation, which makes them infeasible for real-world domains like social networks that are inherently relational. We propose Relational Boosted Bandits (RB2), a contextual bandits algorithm for relational domains based on (relational) boosted trees. RB2 enables us to learn interpretable and explainable models due to the more descriptive nature of the relational representation. We empirically demonstrate the effectiveness and interpretability of RB2 on tasks such as link prediction, relational classification, and recommendations.

* 8 pages, 3 figures 
