"Recommendation": models, code, and papers

Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product

Sep 15, 2020
Tiangang Zhu, Yue Wang, Haoran Li, Youzheng Wu, Xiaodong He, Bowen Zhou

Product attribute values are essential in many e-commerce scenarios, such as customer service robots, product recommendations, and product retrieval. In the real world, however, the attribute values of a product are usually incomplete and vary over time, which greatly hinders practical applications. In this paper, we propose a multimodal method to jointly predict product attributes and extract values from textual product descriptions with the help of the product images. We argue that product attributes and values are highly correlated, e.g., it is easier to extract the values when the product attributes are given. Thus, we jointly model the attribute prediction and value extraction tasks from multiple aspects, focusing on the interactions between attributes and values. Moreover, product images have distinct effects on our tasks for different product attributes and values. Thus, we selectively draw useful visual information from product images to enhance our model. We annotate a multimodal product attribute value dataset that contains 87,194 instances, and the experimental results on this dataset demonstrate that explicitly modeling the relationship between attributes and values helps our method establish the correspondence between them, and that selectively utilizing visual product information is necessary for the task. Our code and dataset will be released to the public.

* Accepted by EMNLP 2020 


Grading video interviews with fairness considerations

Jul 02, 2020
Abhishek Singhania, Abhishek Unnam, Varun Aggarwal

There has been considerable interest in predicting human emotions and traits using facial images and videos. Lately, such work has come under criticism for poor labeling practices, inconclusive prediction results and fairness considerations. We present a careful methodology to automatically derive social skills of candidates based on their video response to interview questions. We, for the first time, include video data from multiple countries encompassing multiple ethnicities. Also, the videos were rated by individuals from multiple racial backgrounds, following several best practices, to achieve a consensus and unbiased measure of social skills. We develop two machine-learning models to predict social skills. The first model employs expert-guidance to use plausibly causal features. The second uses deep learning and depends solely on the empirical correlations present in the data. We compare errors of both these models, study the specificity of the models and make recommendations. We further analyze fairness by studying the errors of models by race and gender. We verify the usefulness of our models by determining how well they predict interview outcomes for candidates. Overall, the study provides strong support for using artificial intelligence for video interview scoring, while taking care of fairness and ethical considerations.

* Submitted to NeurIPS 2020


Multi-Objective Generalized Linear Bandits

May 30, 2019
Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang

In this paper, we study the multi-objective bandits (MOB) problem, where a learner repeatedly selects one arm to play and then receives a reward vector consisting of multiple objectives. MOB has found real-world applications as varied as online recommendation and network routing. These applications typically contain contextual information that can guide the learning process, but most existing work ignores it. To utilize this information, we associate each arm with a context vector and assume the reward follows the generalized linear model (GLM). We adopt the notion of Pareto regret to evaluate the learner's performance and develop a novel algorithm for minimizing it. The essential idea is to apply a variant of the online Newton step to estimate model parameters, based on which we utilize the upper confidence bound (UCB) policy to construct an approximation of the Pareto front, and then choose one arm uniformly at random from the approximate Pareto front. Theoretical analysis shows that the proposed algorithm achieves an $\tilde O(d\sqrt{T})$ Pareto regret, where $T$ is the time horizon and $d$ is the dimension of contexts, which matches the optimal result for the single-objective contextual bandit problem. Numerical experiments demonstrate the effectiveness of our method.
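
The arm-selection step described in the abstract (build an approximate Pareto front from per-objective UCB scores, then pick one arm uniformly at random from it) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `pareto_front` and `choose_arm` are hypothetical, the UCB vectors are assumed to be given, and the online Newton step and GLM estimation that would produce them are omitted.

```python
import random

def dominates(a, b):
    """True if vector a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(ucb):
    """Indices of arms whose UCB vectors are not dominated by any other arm."""
    return [
        i for i, u in enumerate(ucb)
        if not any(dominates(v, u) for j, v in enumerate(ucb) if j != i)
    ]

def choose_arm(ucb, rng=random):
    """Pick one arm uniformly at random from the approximate Pareto front."""
    return rng.choice(pareto_front(ucb))

# Hypothetical UCB scores for four arms and two objectives.
ucb_scores = [(0.9, 0.2), (0.5, 0.5), (0.4, 0.4), (0.1, 0.8)]
front = pareto_front(ucb_scores)   # arm 2 is dominated by arm 1
arm = choose_arm(ucb_scores)
```

Sampling uniformly from the front, rather than always taking a fixed member, treats all non-dominated arms as equally acceptable, which is consistent with the Pareto regret criterion named in the abstract.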


Designing and Implementing Data Warehouse for Agricultural Big Data

May 29, 2019
Vuong M. Ngo, Nhien-An Le-Khac, M-Tahar Kechadi

In recent years, precision agriculture that uses modern information and communication technologies has become very popular. Raw and semi-processed agricultural data are usually collected from various sources, such as the Internet of Things (IoT), sensors, satellites, weather stations, robots, farm equipment, farmers and agribusinesses. Moreover, agricultural datasets are very large, complex, unstructured, heterogeneous, non-standardized, and inconsistent. Hence, agricultural data mining is considered a Big Data application in terms of volume, variety, velocity and veracity. It is a key foundation for establishing a crop intelligence platform, which will enable resource-efficient agronomy decision making and recommendations. In this paper, we design and implement a continental-level agricultural data warehouse by combining Hive, MongoDB and Cassandra. Our data warehouse has the following capabilities: (1) flexible schema; (2) data integration from multiple real agricultural datasets; (3) data science and business intelligence support; (4) high performance; (5) high storage; (6) security; (7) governance and monitoring; (8) replication and recovery; (9) consistency, availability and partition tolerance; (10) distributed and cloud deployment. We also evaluate the performance of our data warehouse.

* BigData 2019 
* Business intelligence, data warehouse, constellation schema, Big Data, precision agriculture


Unsupervised Abbreviation Disambiguation: Contextual disambiguation using word embeddings

Apr 01, 2019
Manuel Ciosici, Tobias Sommer, Ira Assent

As abbreviations often have several distinct meanings, disambiguating their intended meaning in context is important for Machine Reading tasks such as document search, recommendation and question answering. Existing approaches mostly rely on labelled examples of abbreviations and their correct long forms, which are costly to generate and limit the approaches' applicability and flexibility. Importantly, they need to be subjected to a full empirical evaluation, which is cumbersome in practice. In this paper, we present an entirely unsupervised abbreviation disambiguation method (called UAD) that picks up abbreviation definitions from text. Creating distinct tokens per meaning, we learn context representations as word embeddings. We demonstrate how to further boost abbreviation disambiguation performance by obtaining better context representations from additional unstructured text. Our method is the first abbreviation disambiguation approach that features a transparent model allowing performance analysis without full-scale evaluation, making it highly relevant for real-world deployments. In our thorough empirical evaluation, UAD achieves high performance on large real-world document data sets from different domains and outperforms both baseline and state-of-the-art methods. UAD scales well and supports thousands of abbreviations with many different meanings with a single model.


Predictive Clinical Decision Support System with RNN Encoding and Tensor Decoding

Dec 02, 2016
Yinchong Yang, Peter A. Fasching, Markus Wallwiener, Tanja N. Fehm, Sara Y. Brucker, Volker Tresp

With the introduction of Electronic Health Records, large amounts of digital data have become available for analysis and decision support. When physicians prescribe treatments to a patient, they need to consider data of large variety and volume, which makes decisions increasingly complex. Machine learning based Clinical Decision Support (CDS) systems can be a solution to the data challenges. In this work we focus on a class of decision support in which the physicians' decision is directly predicted. Concretely, the model would assign higher probabilities to decisions that it presumes the physician is more likely to make. Thus the CDS system can provide physicians with rational recommendations. We also address the problem of correlation in target features: often a physician is required to make multiple (sub-)decisions in a block, and these decisions are mutually dependent. We propose a solution to the target correlation problem using a tensor factorization model. In order to handle the patients' historical information as sequential data, we apply the so-called Encoder-Decoder framework, which uses Recurrent Neural Networks (RNN) as encoders and a tensor factorization model as a decoder, a combination which is novel in machine learning. In experiments with real-world datasets we show that the proposed model achieves better prediction performance.


Distributed Online Learning via Cooperative Contextual Bandits

Mar 23, 2015
Cem Tekin, Mihaela van der Schaar

In this paper we propose a novel framework for decentralized, online learning by many learners. At each moment of time, an instance characterized by a certain context may arrive at each learner; based on the context, the learner can select one of its own actions (which gives a reward and provides information) or request assistance from another learner. In the latter case, the requester pays a cost and receives the reward but the provider learns the information. In our framework, learners are modeled as cooperative contextual bandits. Each learner seeks to maximize the expected reward from its arrivals, which involves trading off the reward received from its own actions, the information learned from its own actions, the reward received from the actions requested of others and the cost paid for these actions - taking into account what it has learned about the value of assistance from each other learner. We develop distributed online learning algorithms and provide analytic bounds to compare their efficiency with that of the complete-knowledge (oracle) benchmark, in which the expected reward of every action in every context is known by every learner. Our estimates show that regret - the loss incurred by the algorithm - is sublinear in time. Our theoretical framework can be used in many practical applications including Big Data mining, event detection in surveillance sensor networks and distributed online recommendation systems.


A False Sense of Security? Revisiting the State of Machine Learning-Based Industrial Intrusion Detection

May 18, 2022
Dominik Kus, Eric Wagner, Jan Pennekamp, Konrad Wolsing, Ina Berenice Fink, Markus Dahlmanns, Klaus Wehrle, Martin Henze

Anomaly-based intrusion detection promises to detect novel or unknown attacks on industrial control systems by modeling expected system behavior and raising corresponding alarms for any deviations. As manually creating these behavioral models is tedious and error-prone, research focuses on machine learning to train them automatically, achieving detection rates upwards of 99%. However, these approaches are typically trained not only on benign traffic but also on attacks and then evaluated against the same type of attack used for training. Hence, their actual, real-world performance on unknown (not trained on) attacks remains unclear. In turn, the reported near-perfect detection rates of machine learning-based intrusion detection might create a false sense of security. To assess this situation and clarify the real potential of machine learning-based industrial intrusion detection, we develop an evaluation methodology and examine multiple approaches from the literature for their performance on unknown attacks (excluded from training). Our results highlight an ineffectiveness in detecting unknown attacks, with detection rates dropping to between 3.2% and 14.7% for some types of attacks. Moving forward, we derive recommendations for further research on machine learning-based approaches to ensure clarity on their ability to detect unknown attacks.
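
The idea of evaluating on attacks excluded from training can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the paper's methodology: the helper names `split_unknown_attack` and `detection_rate`, the `(feature, label)` sample layout, the attack-type names, and the threshold detector are all hypothetical.

```python
def split_unknown_attack(samples, held_out):
    """Keep every sample of the held-out attack type out of the training set.

    samples: list of (feature, label) pairs; label is "benign" or an attack-type name.
    Returns (train, unknown), where unknown holds only the held-out attack samples.
    """
    train = [(x, y) for x, y in samples if y != held_out]
    unknown = [(x, y) for x, y in samples if y == held_out]
    return train, unknown

def detection_rate(is_alert, unknown):
    """Fraction of held-out attack samples that the detector flags."""
    if not unknown:
        return 0.0
    return sum(1 for x, _ in unknown if is_alert(x)) / len(unknown)

# Toy data with made-up attack types; the feature is a single anomaly score.
samples = [
    (0.1, "benign"), (0.9, "dos"), (0.8, "dos"),
    (0.3, "replay"), (0.7, "replay"),
]
train, unknown = split_unknown_attack(samples, "replay")

# Stand-in detector: a fixed threshold instead of a model trained on `train`.
rate = detection_rate(lambda x: x > 0.5, unknown)
```

Repeating the split once per attack type gives one detection rate per unseen attack, which is the kind of per-type figure the abstract reports.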

* ACM CPSS'22 


Effectively leveraging Multi-modal Features for Movie Genre Classification

Mar 24, 2022
Zhongping Zhang, Yiwen Gu, Bryan A. Plummer, Xin Miao, Jiayi Liu, Huayan Wang

Movie genre classification has been widely studied in recent years due to its various applications in video editing, summarization, and recommendation. Prior work has typically addressed this task by predicting genres based solely on the visual content. As a result, these methods often perform poorly for genres such as documentary or musical, since non-visual modalities like audio or language play an important role in correctly classifying these genres. In addition, frame-level analysis of long videos incurs high computational cost and makes prediction less efficient. To address these two issues, we propose a Multi-Modal approach leveraging shot information, MMShot, to classify video genres in an efficient and effective way. We evaluate our method on MovieNet and Condensed Movies for genre classification, achieving a 17% to 21% improvement in mean Average Precision (mAP) over the state-of-the-art. Extensive experiments are conducted to demonstrate the ability of MMShot for long video analysis and uncover the correlations between genres and multiple movie elements. We also demonstrate our approach's ability to generalize by evaluating it on the scene boundary detection task, achieving a 1.1% improvement in Average Precision (AP) over the state-of-the-art.


Don't Get Me Wrong: How to apply Deep Visual Interpretations to Time Series

Mar 14, 2022
Christoffer Loeffler, Wei-Cheng Lai, Bjoern Eskofier, Dario Zanca, Lukas Schmidt, Christopher Mutschler

The correct interpretation and understanding of deep learning models is essential in many applications. Explanatory visual interpretation approaches for image and natural language processing allow domain experts to validate and understand almost any deep learning model. However, they fall short when generalizing to arbitrary time series data that is less intuitive and more diverse. Whether a visualization explains the true reasoning or captures the real features is difficult to judge. Hence, instead of blind trust we need an objective evaluation to obtain reliable quality metrics. We propose a framework of six orthogonal metrics for gradient- or perturbation-based post-hoc visual interpretation methods designed for time series classification and segmentation tasks. An experimental study includes popular neural network architectures for time series and nine visual interpretation methods. We evaluate the visual interpretation methods with diverse datasets from the UCR repository and a complex real-world dataset, and study the influence of common regularization techniques during training. We show that none of the methods consistently outperforms any of the others on all metrics while some are ahead at times. Our insights and recommendations allow experts to make informed choices of suitable visualization techniques for the model and task at hand.

* 32 pages, 13 figures
