"Recommendation": models, code, and papers

Attention-based Dynamic Subspace Learners for Medical Image Analysis

Jun 18, 2022
Sukesh Adiga V, Jose Dolz, Herve Lombaert

Learning similarity is a key aspect in medical image analysis, particularly in recommendation systems or in uncovering the interpretation of anatomical data in images. Most existing methods learn such similarities in the embedding space over image sets using a single metric learner. Images, however, have a variety of object attributes such as color, shape, or artifacts. Encoding such attributes using a single metric learner is inadequate and may fail to generalize. Instead, multiple learners could focus on separate aspects of these attributes in subspaces of an overarching embedding. This, however, implies that the number of learners must be found empirically for each new dataset. This work, Dynamic Subspace Learners, proposes to dynamically exploit multiple learners by removing the need to know a priori the number of learners and by aggregating new subspace learners during training. Furthermore, the visual interpretability of such subspace learning is enforced by integrating an attention module into our method. This integrated attention mechanism provides visual insight into the discriminative image features that contribute to the clustering of image sets and a visual explanation of the embedding features. The benefits of our attention-based dynamic subspace learners are evaluated in the applications of image clustering, image retrieval, and weakly supervised segmentation. Our method achieves results competitive with the performance of multiple-learner baselines and significantly outperforms the classification network in terms of clustering and retrieval scores on three different public benchmark datasets. Moreover, our attention maps offer proxy labels, which improve the segmentation accuracy by up to 15% in Dice scores when compared to state-of-the-art interpretation techniques.

  

A Review of Machine Learning Methods Applied to Structural Dynamics and Vibroacoustic

Apr 13, 2022
Barbara Cunha, Christophe Droz, Abdelmalek Zine, Stéphane Foulard, Mohamed Ichchou

The use of Machine Learning (ML) has rapidly spread across several fields and has found many applications in Structural Dynamics and Vibroacoustic (SD&V). The increasing capabilities of ML to unveil insights from data, driven by unprecedented data availability, algorithmic advances and computational power, enhance decision making, uncertainty handling, pattern recognition and real-time assessments. Three main applications in SD&V have taken advantage of these benefits. In Structural Health Monitoring, ML detection and prognosis lead to safe operation and optimized maintenance schedules. System identification and control design are leveraged by ML techniques in Active Noise Control and Active Vibration Control. Finally, the so-called ML-based surrogate models provide fast alternatives to costly simulations, enabling robust and optimized product design. Despite the many works in the area, they have not been reviewed and analyzed. Therefore, to keep track of and understand this ongoing integration of fields, this paper presents a survey of ML applications in SD&V analyses, shedding light on the current state of implementation and emerging opportunities. The main methodologies, advantages, limitations, and recommendations based on scientific knowledge were identified for each of the three applications. Moreover, the paper considers the role of Digital Twins and Physics Guided ML in overcoming current challenges and powering future research progress. As a result, the survey provides a broad overview of the present landscape of ML applied in SD&V and guides the reader to an advanced understanding of progress and prospects in the field.

  

Role of Data Augmentation Strategies in Knowledge Distillation for Wearable Sensor Data

Jan 01, 2022
Eun Som Jeon, Anirudh Som, Ankita Shukla, Kristina Hasanaj, Matthew P. Buman, Pavan Turaga

Deep neural networks are parametrized by several thousands or millions of parameters, and have shown tremendous success in many classification problems. However, the large number of parameters makes it difficult to integrate these models into edge devices such as smartphones and wearable devices. To address this problem, knowledge distillation (KD) has been widely employed; it uses a pre-trained high-capacity network to train a much smaller network suitable for edge devices. In this paper, for the first time, we study the applicability and challenges of using KD for time-series data for wearable devices. Successful application of KD requires specific choices of data augmentation methods during training. However, it is not yet known if there exists a coherent strategy for choosing an augmentation approach during KD. In this paper, we report the results of a detailed study that compares and contrasts various common choices and some hybrid data augmentation strategies in KD-based human activity analysis. Research in this area is often limited as there are not many comprehensive databases available in the public domain from wearable devices. Our study considers databases ranging from small-scale publicly available ones to one derived from a large-scale interventional study into human activity and sedentary behavior. We find that the choice of data augmentation techniques during KD has a variable level of impact on end performance, and that the optimal network choice as well as data augmentation strategies are specific to the dataset at hand. However, we also conclude with a general set of recommendations that can provide a strong baseline performance across databases.
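As a rough illustration of the teacher-student setup the abstract describes, the sketch below computes a widely used distillation objective: a weighted sum of the hard-label cross-entropy and a temperature-softened KL divergence between teacher and student outputs. This is an assumption made for illustration; the abstract does not state the exact loss the authors use, and the names `kd_loss`, `temperature`, and `alpha` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.5):
    """Hypothetical distillation loss (not taken from the paper).

    alpha weights the hard-label cross-entropy; (1 - alpha) weights the
    KL divergence between temperature-softened teacher and student outputs.
    """
    # Cross-entropy of the student against the ground-truth label.
    p_student = softmax(student_logits)
    cross_entropy = -math.log(p_student[label])

    # KL(teacher || student) on softened distributions.
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_t, p_s) if t > 0)

    # The T^2 factor keeps the soft-target term on a comparable scale.
    return alpha * cross_entropy + (1 - alpha) * (temperature ** 2) * kl
```

In a study like the one described, the augmentation strategy would determine which inputs produce `student_logits` and `teacher_logits`; the loss itself stays the same.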

  

Learning Optimal and Fair Decision Trees for Non-Discriminative Decision-Making

Mar 25, 2019
Sina Aghaei, Mohammad Javad Azizi, Phebe Vayanos

In recent years, automated data-driven decision-making systems have enjoyed tremendous success in a variety of fields (e.g., to make product recommendations, or to guide the production of entertainment). More recently, these algorithms are increasingly being used to assist socially sensitive decision-making (e.g., to decide who to admit into a degree program or to prioritize individuals for public housing). Yet, these automated tools may result in discriminative decision-making in the sense that they may treat individuals unfairly or unequally based on membership in a category or a minority, resulting in disparate treatment or disparate impact and violating both moral and ethical standards. This may happen when the training dataset is itself biased (e.g., if individuals belonging to a particular group have historically been discriminated against). However, it may also happen when the training dataset is unbiased, if the errors made by the system affect individuals belonging to a category or minority differently (e.g., if misclassification rates for Blacks are higher than for Whites). In this paper, we unify the definitions of unfairness across classification and regression. We propose a versatile mixed-integer optimization framework for learning optimal and fair decision trees and variants thereof to prevent disparate treatment and/or disparate impact as appropriate. This translates to a flexible schema for designing fair and interpretable policies suitable for socially sensitive decision-making. We conduct extensive computational studies that show that our framework improves the state-of-the-art in the field (which typically relies on heuristics) to yield non-discriminative decisions at lower cost to overall accuracy.
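The two failure modes mentioned above (unequal outcomes across groups and unequal error rates across groups) can be quantified in several ways. The sketch below shows two common measures, a statistical parity gap and a misclassification-rate gap, for binary predictions. These are generic fairness metrics chosen as an assumption for illustration; they are not necessarily the unified definitions the paper proposes, and all function names are hypothetical.

```python
def group_rate(values, groups, group):
    """Mean of `values` restricted to samples whose group equals `group`."""
    selected = [v for v, g in zip(values, groups) if g == group]
    if not selected:
        raise ValueError(f"no samples for group {group!r}")
    return sum(selected) / len(selected)


def statistical_parity_gap(predictions, groups):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = [group_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)


def error_rate_gap(predictions, labels, groups):
    """Largest difference in misclassification rate between any two groups."""
    errors = [int(p != y) for p, y in zip(predictions, labels)]
    rates = [group_rate(errors, groups, g) for g in set(groups)]
    return max(rates) - min(rates)
```

A gap of zero means the groups are treated identically under that measure; a fairness-aware learner could bound or penalize such a gap while fitting the model.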

* 33rd AAAI Conference on Artificial Intelligence, 2019 
  

MAVE: A Product Dataset for Multi-source Attribute Value Extraction

Dec 16, 2021
Li Yang, Qifan Wang, Zac Yu, Anand Kulkarni, Sumit Sanghai, Bin Shu, Jon Elsas, Bhargav Kanagal

Attribute value extraction refers to the task of identifying values of an attribute of interest from product information. Product attribute values are essential in many e-commerce scenarios, such as customer service robots, product ranking, retrieval and recommendations. In the real world, however, the attribute values of a product are usually incomplete and vary over time, which greatly hinders practical applications. In this paper, we introduce MAVE, a new dataset to better facilitate research on product attribute value extraction. MAVE is composed of a curated set of 2.2 million products from Amazon pages, with 3 million attribute-value annotations across 1257 unique categories. MAVE has four main and unique advantages: First, MAVE is the largest product attribute value extraction dataset by the number of attribute-value examples. Second, MAVE includes multi-source representations from the product, which capture the full product information with high attribute coverage. Third, MAVE represents a more diverse set of attributes and values relative to what previous datasets cover. Lastly, MAVE provides a very challenging zero-shot test set, as we empirically illustrate in the experiments. We further propose a novel approach that effectively extracts the attribute value from the multi-source product information. We conduct extensive experiments with several baselines and show that MAVE is an effective dataset for the attribute value extraction task. It is also a very challenging task on zero-shot attribute extraction. Data is available at https://github.com/google-research-datasets/MAVE.

* 10 pages, 7 figures. Accepted to WSDM 2022. Dataset available at https://github.com/google-research-datasets/MAVE 
  

Profiling US Restaurants from Billions of Payment Card Transactions

Sep 05, 2020
Himel Dev, Hossein Hamooni

A payment card (such as debit or credit) is one of the most convenient payment methods for purchasing goods and services. Hundreds of millions of card transactions take place across the globe every day, generating a massive volume of transaction data. The data render a holistic view of cardholder-merchant interactions, containing insights that can benefit various applications, such as payment fraud detection and merchant recommendation. However, utilizing these insights often requires additional information about merchants missing from the data owner's (i.e., payment company's) perspective. For example, payment companies do not know the exact type of product a merchant serves. Collecting merchant attributes from external sources for commercial purposes can be expensive. Motivated by this limitation, we aim to infer latent merchant attributes from transaction data. As proof of concept, we concentrate on restaurants and infer the cuisine types of restaurants from transactions. To this end, we present a framework for inferring the cuisine types of restaurants from transaction data. Our proposed framework consists of three steps. In the first step, we generate cuisine labels for a limited number of restaurants via weak supervision. In the second step, we extract a wide variety of statistical features and neural embeddings from the restaurant transactions. In the third step, we use deep neural networks (DNNs) to infer the remaining restaurants' cuisine types. The proposed framework achieved a 76.2% accuracy in classifying the US restaurants. To the best of our knowledge, this is the first framework to infer the cuisine types of restaurants by analyzing transaction data as the only source.

* The 7th IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2020 
  

Inductive Relational Matrix Completion

Jul 09, 2020
Qitian Wu, Hengrui Zhang, Hongyuan Zha

Data sparsity and cold-start issues emerge as two major bottlenecks for matrix completion in the context of user-item interaction matrices. We propose a novel method that can fundamentally address these issues. The main idea is to partition users into support users, which have many observed interactions (i.e., non-zero entries in the matrix), and query users, which have few observed entries. For support users, we learn their transductive preference embeddings using matrix factorization over their interactions (a relatively dense sub-matrix). For query users, we devise an inductive relational model that learns to estimate the underlying relations between the two groups of users. This allows us to attentively aggregate the preference embeddings of support users in order to compute inductive embeddings for query users. This new method can address the data sparsity issue by generalizing the behavior patterns of warm-start users to others and thus enables the model to also work effectively for cold-start users with no historical interaction. As theoretical insights, we show that a general version of our model does not sacrifice any expressive power on query users compared with transductive matrix factorization under mild conditions. Also, the generalization error on query users is bounded by the numbers of support users and query users' observed interactions. Moreover, extensive experiments on real-world datasets demonstrate that our model outperforms several state-of-the-art methods by achieving significant improvements on MAE and AUC for warm-start, few-shot (sparsity) and zero-shot (cold-start) recommendation.
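The aggregation step described above, in which a query user's embedding is computed as an attention-weighted combination of support users' embeddings, can be sketched as follows. The paper learns a relational model to estimate the attention weights; here plain dot-product attention with a softmax is swapped in as a simplifying assumption, and the names `attention_aggregate`, `query_vec`, and `support_embeddings` are hypothetical.

```python
import math


def attention_aggregate(query_vec, support_embeddings):
    """Return (aggregated embedding, attention weights) for one query user.

    Assumption: attention scores are dot products between a query-user
    representation and each support user's embedding. The paper instead
    learns these relations with a dedicated model.
    """
    # Score each support user against the query representation.
    scores = [
        sum(q * s for q, s in zip(query_vec, emb)) for emb in support_embeddings
    ]

    # Softmax over the scores (shifted by the max for numerical stability).
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of support embeddings, one dimension at a time.
    dim = len(support_embeddings[0])
    aggregated = [
        sum(w * emb[d] for w, emb in zip(weights, support_embeddings))
        for d in range(dim)
    ]
    return aggregated, weights
```

Because the query embedding is built only from support embeddings and the attention weights, a new user needs no retraining of the factorization, which is what makes the approach inductive.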

  

One-way Explainability Isn't The Message

May 05, 2022
Ashwin Srinivasan, Michael Bain, Enrico Coiera

Recent engineering developments in specialised computational hardware, data-acquisition and storage technology have seen the emergence of Machine Learning (ML) as a powerful form of data analysis with widespread applicability beyond its historical roots in the design of autonomous agents. However -- possibly because of its origins in the development of agents capable of self-discovery -- relatively little attention has been paid to the interaction between people and ML. In this paper we are concerned with the use of ML in automated or semi-automated tools that assist one or more human decision makers. We argue that requirements on both human and machine in this context are significantly different to the use of ML either as part of autonomous agents for self-discovery or as part of statistical data analysis. Our principal position is that the design of such human-machine systems should be driven by repeated, two-way intelligibility of information rather than one-way explainability of the ML-system's recommendations. Iterated rounds of intelligible information exchange, we think, will characterise the kinds of collaboration that will be needed to understand complex phenomena for which neither man nor machine has complete answers. We propose operational principles -- we call them Intelligibility Axioms -- to guide the design of a collaborative decision-support system. The principles are concerned with: (a) what it means for information provided by the human to be intelligible to the ML system; and (b) what it means for an explanation provided by an ML system to be intelligible to a human. Using examples from the literature on the use of ML for drug-design and in medicine, we demonstrate cases where the conditions of the axioms are met. We describe some additional requirements needed for the design of a truly collaborative decision-support system.

* (22 pages. Submitted for review as a Perspectives paper to Nature Machine Intelligence) 
  

Multi-Interactive Attention Network for Fine-grained Feature Learning in CTR Prediction

Dec 13, 2020
Kai Zhang, Hao Qian, Qing Cui, Qi Liu, Longfei Li, Jun Zhou, Jianhui Ma, Enhong Chen

In the Click-Through Rate (CTR) prediction scenario, users' sequential behaviors are well utilized to capture user interest in the recent literature. However, despite being extensively studied, these sequential methods still suffer from three limitations. First, existing methods mostly utilize attention on the behavior of users, which is not always suitable for CTR prediction, because users often click on new products that are irrelevant to any historical behaviors. Second, in real scenarios, there are numerous users who performed operations a long time ago but have become relatively inactive in recent times. Thus, it is hard to precisely capture a user's current preferences through early behaviors. Third, multiple representations of users' historical behaviors in different feature subspaces are largely ignored. To remedy these issues, we propose a Multi-Interactive Attention Network (MIAN) to comprehensively extract the latent relationship among all kinds of fine-grained features (e.g., gender, age and occupation in user-profile). Specifically, MIAN contains a Multi-Interactive Layer (MIL) that integrates three local interaction modules to capture multiple representations of user preference through sequential behaviors and simultaneously utilize the fine-grained user-specific as well as context information. In addition, we design a Global Interaction Module (GIM) to learn the high-order interactions and balance the different impacts of multiple features. Finally, offline experiment results from three datasets, together with an online A/B test in a large-scale recommendation system, demonstrate the effectiveness of our proposed approach.

* 9 pages, 6 figures, WSDM2021, accepted 
  

Pair-view Unsupervised Graph Representation Learning

Dec 11, 2020
You Li, Binli Luo, Ning Gui

Low-dimension graph embeddings have proved extremely useful in various downstream tasks in large graphs, e.g., link-related content recommendation and node classification tasks. Most existing embedding approaches take nodes as the basic unit for information aggregation, e.g., node perception fields in GNN or contextual nodes in random walks. The main drawback of such a node-view is its lack of support for expressing the compound relationships between nodes, which results in the loss of a certain degree of graph information during embedding. To this end, this paper proposes PairE (Pair Embedding), a solution that uses the "pair", a higher-level unit than a "node", as the core for graph embeddings. Accordingly, a multi-self-supervised auto-encoder is designed to fulfill two pretext tasks: to reconstruct the feature distribution for respective pairs and their surrounding context. PairE has three major advantages: 1) Informative: embeddings beyond the node-view are able to preserve richer information of the graph; 2) Simple: the solutions provided by PairE are time-saving, storage-efficient, and require fewer hyper-parameters; 3) High adaptability: with the introduced translator operator to map pair embeddings to node embeddings, PairE can be effectively used in both link-based and node-based graph analysis. Experiment results show that PairE consistently outperforms state-of-the-art baselines in all four downstream tasks, especially with significant edges in the link-prediction and multi-label node classification tasks.

* 9 pages, 3 figures and 4 tables 
  