
"Topic": models, code, and papers

A Hierarchical Conditional Random Field-based Attention Mechanism Approach for Gastric Histopathology Image Classification

Feb 21, 2021
Yixin Li, Xinran Wu, Chen Li, Changhao Sun, Md Rahaman, Yudong Yao, Xiaoyan Li, Yong Zhang, Tao Jiang

Gastric Histopathology Image Classification (GHIC) tasks are usually weakly supervised learning problems, and the images inevitably contain redundant information. Designing networks that can focus on effective distinguishing features has therefore become a popular research topic. In this paper, to perform GHIC tasks effectively and to assist pathologists in clinical diagnosis, an intelligent Hierarchical Conditional Random Field based Attention Mechanism (HCRF-AM) model is proposed. The HCRF-AM model consists of an Attention Mechanism (AM) module and an Image Classification (IC) module. In the AM module, an HCRF model is built to extract attention regions. In the IC module, a Convolutional Neural Network (CNN) model is trained on the selected attention regions, and an algorithm called Classification Probability-based Ensemble Learning is then applied to obtain image-level results from the patch-level output of the CNN. In the experiments, a classification specificity of 96.67% is achieved on a gastric histopathology dataset of 700 images. Our HCRF-AM model demonstrates high classification performance and shows its effectiveness and future potential in the GHIC field.
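The abstract does not spell out how Classification Probability-based Ensemble Learning aggregates patch outputs; a minimal sketch, assuming simple averaging of patch-level positive-class probabilities (the function name and threshold are illustrative, not the paper's):

```python
import numpy as np

def image_level_prediction(patch_probs, threshold=0.5):
    """Aggregate patch-level CNN probabilities into an image-level label
    by averaging the positive-class probabilities over all patches."""
    patch_probs = np.asarray(patch_probs, dtype=float)
    mean_prob = patch_probs.mean()
    return int(mean_prob >= threshold), mean_prob

# Example: five patches cropped from one histopathology image
label, prob = image_level_prediction([0.9, 0.8, 0.3, 0.95, 0.7])
```

Other pooling rules (majority vote, max-probability) would slot into the same place; averaging is simply one plausible instance.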



A Simple Deep Equilibrium Model Converges to Global Optima with Weight Tying

Feb 15, 2021
Kenji Kawaguchi

A deep equilibrium linear model is implicitly defined through an equilibrium point of an infinite sequence of computation. It avoids any explicit computation of the infinite sequence by finding an equilibrium point directly via root-finding and by computing gradients via implicit differentiation. It is a simple deep equilibrium model with nonlinear activations on weight matrices. In this paper, we analyze the gradient dynamics of this simple deep equilibrium model with non-convex objective functions for a general class of losses used in regression and classification. Despite the non-convexity, convergence to a global optimum at a linear rate is guaranteed without any assumption on the width of the models, allowing the width to be smaller than the output dimension and the number of data points. Moreover, we prove a relation between the gradient dynamics of the simple deep equilibrium model and the dynamics of the trust-region Newton method on a shallow model. This mathematically proven relation, along with our numerical observations, suggests the importance of understanding implicit bias and a possible open problem on the topic. Our proofs deal with nonlinearity and weight tying, and differ from those in the related literature.
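The root-finding and implicit-differentiation ideas can be sketched for a purely linear equilibrium layer z = Wz + Ux (the function names are ours, and this omits the nonlinear activations on the weight matrices that the paper's model includes):

```python
import numpy as np

def deq_linear_forward(W, U, x):
    """Equilibrium point z* of z = W z + U x, found by a direct linear
    solve (I - W) z = U x instead of unrolling the infinite iteration."""
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - W, U @ x)

def deq_linear_grad_x(W, U, z_bar):
    """Implicit differentiation: for an upstream gradient z_bar, the
    input gradient is U^T (I - W)^{-T} z_bar -- one adjoint solve,
    no backpropagation through the iteration."""
    n = W.shape[0]
    adjoint = np.linalg.solve((np.eye(n) - W).T, z_bar)
    return U.T @ adjoint

# Scalar example: z = 0.5 z + 2 x with x = 3, so z* = 4 x = 12
W = np.array([[0.5]])
U = np.array([[2.0]])
x = np.array([3.0])
z = deq_linear_forward(W, U, x)
```

Since z* = 4x here, the implicit gradient with upstream gradient 1 is exactly 4, matching what unrolling the infinite sequence would converge to.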

* ICLR 2021. Selected for ICLR Spotlight (top 6% submissions) 


Leveraging Benchmarking Data for Informed One-Shot Dynamic Algorithm Selection

Feb 12, 2021
Furong Ye, Carola Doerr, Thomas Bäck

A key challenge in the application of evolutionary algorithms in practice is the selection of an algorithm instance that best suits the problem at hand. What complicates this decision further is that different algorithms may be best suited for different stages of the optimization process. Dynamic algorithm selection and configuration are therefore well-researched topics in evolutionary computation. However, while hyper-heuristics and parameter control studies typically assume a setting in which the algorithm needs to be chosen while running it, without prior information, AutoML approaches such as hyper-parameter tuning and automated algorithm configuration assume the possibility of evaluating different configurations before making a final recommendation. In practice, however, we are often in a middle ground between these two settings, where we need to decide on the algorithm instance before the run (the "one-shot" setting), but where we have (possibly lots of) data available on which we can base an informed decision. We analyze in this work how such prior performance data can be used to infer informed dynamic algorithm selection schemes for the solution of pseudo-Boolean optimization problems. Our specific use-case considers a family of genetic algorithms.
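A minimal sketch of what informed one-shot selection could look like, assuming the benchmark data is available as mean best-so-far fitness values at budget checkpoints (the algorithm names and numbers here are invented for illustration):

```python
# Hypothetical benchmark data: mean best-so-far fitness of each algorithm
# after given budget checkpoints (higher is better).
benchmark = {
    "GA(mu=1)":  {100: 40.0, 500: 62.0, 1000: 70.0},
    "GA(mu=10)": {100: 35.0, 500: 66.0, 1000: 78.0},
}

def one_shot_selection(data, budget):
    """Static one-shot choice: pick, before the run, the algorithm whose
    recorded benchmark performance at the target budget is best."""
    return max(data, key=lambda alg: data[alg][budget])

def one_shot_dynamic_schedule(data, checkpoints):
    """Informed dynamic variant: assign to each budget phase the algorithm
    that improved most in that phase during benchmarking. The whole
    schedule is still fixed before the run -- no online switching."""
    schedule = []
    prev = {alg: 0.0 for alg in data}
    for cp in checkpoints:
        best = max(data, key=lambda alg: data[alg][cp] - prev[alg])
        schedule.append((cp, best))
        prev = {alg: data[alg][cp] for alg in data}
    return schedule
```

In this toy data, the dynamic schedule starts with the greedy `GA(mu=1)` and hands over to `GA(mu=10)` later, even though the static one-shot pick is `GA(mu=10)` throughout.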

* Submitted for review to GECCO'21 


Modeling Complex Financial Products

Feb 03, 2021
Margret Bjarnadottir, Louiqa Raschid

The objective of this paper is to explore how financial big data and machine learning methods can be applied to model and understand complex financial products. We focus on residential mortgage-backed securities, resMBS, which were at the heart of the 2008 US financial crisis. The securities are contained within a prospectus and have a complex payoff structure. Multiple financial institutions form a supply chain to create the prospectuses. We provide insight into the performance of the resMBS securities through a series of increasingly complex models. First, models at the security level directly identify salient features of resMBS securities that impact their performance. Second, we extend the model to include prospectus-level features. We are the first to demonstrate that the composition of the prospectus is associated with the performance of securities. Finally, to develop a deeper understanding of the role of the supply chain, we use unsupervised probabilistic methods, in particular dynamic topic models (DTM), to understand community formation and temporal evolution along the chain. A comprehensive model provides insight into the impact of DTM communities on the issuance and evolution of prospectuses, and eventually the performance of resMBS securities.



Open-Domain Conversational Search Assistant with Transformers

Jan 20, 2021
Rafael Ferreira, Mariana Leite, David Semedo, Joao Magalhaes

Open-domain conversational search assistants aim at answering user questions about open topics in a conversational manner. In this paper we show how the Transformer architecture achieves state-of-the-art results in key IR tasks, leveraging the creation of conversational assistants that engage in open-domain conversational search with single, yet informative, answers. In particular, we propose an open-domain abstractive conversational search agent pipeline to address two major challenges: first, conversation context-aware search and second, abstractive search-answer generation. To address the first challenge, the conversation context is modeled with a query rewriting method that unfolds the context of the conversation up to a specific moment to search for the correct answers. These answers are then passed to a Transformer-based re-ranker to further improve retrieval performance. The second challenge is tackled with recent abstractive Transformer architectures to generate a digest of the most relevant passages. Experiments show that Transformers deliver a solid performance across all tasks in conversational search, outperforming the best TREC CAsT 2019 baseline.
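As a toy illustration of the query-rewriting idea only (the paper uses a learned rewriting model, with the Transformer re-ranker and abstractive summarizer running downstream), content words from earlier turns can be unfolded into the current, underspecified turn:

```python
def rewrite_query(history, current_query):
    """Toy context unfolding: expand the current turn with content words
    from earlier turns that it does not already mention."""
    stopwords = {"what", "is", "the", "a", "an", "of", "about",
                 "tell", "me", "it", "how", "does", "work"}
    context_terms = []
    for turn in history:
        for tok in turn.lower().split():
            tok = tok.strip("?.,!")
            if tok not in stopwords and tok not in context_terms:
                context_terms.append(tok)
    query_tokens = [t.strip("?.,!") for t in current_query.lower().split()]
    missing = [t for t in context_terms if t not in query_tokens]
    return current_query + " " + " ".join(missing) if missing else current_query

rewritten = rewrite_query(["Tell me about the Transformer architecture"],
                          "How does it work?")
```

The rewritten query ("How does it work? transformer architecture") is self-contained enough for a standard retriever, which is exactly the property the learned rewriter is trained to provide.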



Similarity Analysis of Self-Supervised Speech Representations

Oct 22, 2020
Yu-An Chung, Yonatan Belinkov, James Glass

Self-supervised speech representation learning has recently been a thriving research topic. Many algorithms have been proposed for learning useful representations from large-scale unlabeled data, and their applications to a wide range of speech tasks have also been investigated. However, there has been little research focusing on understanding the properties of existing approaches. In this work, we aim to provide a comparative study of some of the most representative self-supervised algorithms. Specifically, we quantify the similarities between different self-supervised representations using existing similarity measures. We also design probing tasks to study the correlation between the models' pre-training loss and the amount of specific speech information contained in their learned representations. In addition to showing how various self-supervised models behave differently given the same input, our study also finds that the training objective has a higher impact on representation similarity than architectural choices such as building blocks (RNN/Transformer/CNN) and directionality (uni/bidirectional). Our results also suggest that there exists a strong correlation between pre-training loss and downstream performance for some self-supervised algorithms.
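One widely used similarity measure such a comparative study can rely on is linear Centered Kernel Alignment (CKA), which scores two representation matrices on the same examples; a minimal implementation (the abstract does not say which measures were used, so this is one plausible example):

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two representation
    matrices of shape (n_examples, dim). Returns 1.0 when the two
    representations are identical up to rotation and isotropic scaling."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    num = np.linalg.norm(X.T @ Y, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den
```

Its invariance to scaling and orthogonal transformation is what makes it suitable for comparing layers of different self-supervised models, whose feature dimensions need not align.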



Joint coded aperture optimization and compressive hyperspectral image classification using 3D coded neural network

Oct 04, 2020
Hao Zhang

Hyperspectral image classification (HIC) is an active research topic in remote sensing. However, the huge volume of three-dimensional (3D) hyperspectral images poses big challenges in data acquisition, storage, transmission and processing. To overcome these limitations, this paper develops a novel deep learning HIC approach based on the compressive measurements of a coded-aperture snapshot spectral imaging (CASSI) system, without reconstructing the complete hyperspectral data cube. A new kind of deep learning strategy, namely the 3D coded convolutional neural network (3D-CCNN), is proposed to efficiently solve the HIC problem, where the hardware-based coded aperture is regarded as a pixel-wise connected network layer. An end-to-end training method is developed to jointly optimize the network parameters and the coded aperture pattern with periodic structure. The accuracy of the HIC approach is effectively improved by involving the degrees of optimization freedom from the coded aperture. The superiority of the proposed method over state-of-the-art HIC methods is demonstrated on several public hyperspectral datasets.
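A simplified sketch of how a coded aperture can act as a pixel-wise layer in the measurement model (the disperser's spectral shearing is omitted for brevity, and the periodic tiling mirrors the paper's periodic-structure constraint; both function names are ours):

```python
import numpy as np

def cassi_measurement(cube, mask):
    """Simplified CASSI-style compressive measurement: the binary coded
    aperture multiplies each spectral band pixel-wise, and the coded
    bands are summed into a single 2D snapshot."""
    # cube: (H, W, L) hyperspectral data; mask: (H, W) with entries in {0, 1}
    coded = cube * mask[:, :, None]   # the "pixel-wise connected layer"
    return coded.sum(axis=2)          # one snapshot measurement

def tile_periodic(pattern, H, W):
    """Replicate a small pattern periodically over the aperture plane,
    so only the base pattern's entries need to be optimized."""
    reps = (-(-H // pattern.shape[0]), -(-W // pattern.shape[1]))
    return np.tile(pattern, reps)[:H, :W]
```

In the end-to-end setting, the base pattern would be a trainable (binarized) parameter updated jointly with the CNN weights; here it is fixed to show the forward path only.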



Compressive Spectral Image Classification using 3D Coded Neural Network

Sep 23, 2020
Hao Zhang

Hyperspectral image classification (HIC) is an active research topic in remote sensing. However, the huge volume of three-dimensional (3D) hyperspectral images poses big challenges in data acquisition, storage, transmission and processing. To overcome these limitations, this paper develops a novel deep learning HIC approach based on the compressive measurements of a coded-aperture snapshot spectral imaging (CASSI) system, without reconstructing the complete hyperspectral data cube. A new kind of deep learning strategy, namely the 3D coded convolutional neural network (3D-CCNN), is proposed to efficiently solve the HIC problem, where the hardware-based coded aperture is regarded as a pixel-wise connected network layer. An end-to-end training method is developed to jointly optimize the network parameters and the coded aperture pattern with periodic structure. The accuracy of the HIC approach is effectively improved by involving the degrees of optimization freedom from the coded aperture. The superiority of the proposed method over state-of-the-art HIC methods is demonstrated on several public hyperspectral datasets.



Multivariate Time-series Anomaly Detection via Graph Attention Network

Sep 04, 2020
Hang Zhao, Yujing Wang, Juanyong Duan, Congrui Huang, Defu Cao, Yunhai Tong, Bixiong Xu, Jing Bai, Jie Tong, Qi Zhang

Anomaly detection on multivariate time-series is of great importance in both data mining research and industrial applications. Recent approaches have achieved significant progress on this topic, but there are remaining limitations. One major limitation is that they do not capture the relationships between different time-series explicitly, resulting in inevitable false alarms. In this paper, we propose a novel self-supervised framework for multivariate time-series anomaly detection to address this issue. Our framework considers each univariate time-series as an individual feature and includes two graph attention layers in parallel to learn the complex dependencies of multivariate time-series in both temporal and feature dimensions. In addition, our approach jointly optimizes a forecasting-based model and a reconstruction-based model, obtaining better time-series representations through a combination of single-timestamp prediction and reconstruction of the entire time-series. We demonstrate the efficacy of our model through extensive experiments. The proposed method outperforms other state-of-the-art models on three real-world datasets. Further analysis shows that our method has good interpretability and is useful for anomaly diagnosis.
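A hypothetical inference-time score combining the two jointly optimized objectives, single-timestamp forecasting error and whole-window reconstruction error (the weighting `gamma` and the function name are assumptions for illustration, not the paper's formulation):

```python
import numpy as np

def joint_anomaly_score(x_next, forecast, window, reconstruction, gamma=0.5):
    """Anomaly score for one window: a convex combination of the
    forecasting-based error at the next timestamp and the
    reconstruction-based error over the whole window."""
    forecast_err = np.mean((np.asarray(x_next) - np.asarray(forecast)) ** 2)
    recon_err = np.mean((np.asarray(window) - np.asarray(reconstruction)) ** 2)
    return gamma * forecast_err + (1 - gamma) * recon_err
```

A timestamp is then flagged as anomalous when this score exceeds a threshold calibrated on validation data; the combination is what lets the detector catch both abrupt point deviations and distorted overall patterns.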

* Accepted by ICDM 2020. 10 pages 


An Enhanced Text Classification to Explore Health based Indian Government Policy Tweets

Aug 18, 2020
Aarzoo Dhiman, Durga Toshniwal

Government-sponsored policy-making and scheme generation are among the means of protecting and promoting the social, economic, and personal development of citizens. Evaluations of the effectiveness of these schemes by the government provide only statistical information in terms of facts and figures, which does not include in-depth knowledge of public perceptions, experiences, and views on the topic. In this research work, we propose an improved text classification framework that classifies Twitter data about different health-based government schemes. The proposed framework leverages the language representation models (LR models) BERT, ELMo, and USE. However, these LR models have less real-time applicability due to the scarcity of ample annotated data. To handle this, we propose a novel GloVe word embeddings and class-specific sentiments based text augmentation approach (named Mod-EDA), which boosts the performance of the text classification task by increasing the size of the labeled data. Furthermore, the trained model is leveraged to identify the level of engagement of citizens towards these policies in different communities, such as middle-income and low-income groups.
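A toy sketch of embedding-based augmentation in the spirit of Mod-EDA, with a tiny hand-made embedding table standing in for pre-trained GloVe vectors and without the class-specific sentiment filtering the paper adds:

```python
import numpy as np

# Toy embedding table; in Mod-EDA these would be real GloVe vectors.
emb = {
    "doctor":    np.array([1.0, 0.1]),
    "physician": np.array([0.95, 0.12]),
    "hospital":  np.array([0.1, 1.0]),
    "clinic":    np.array([0.15, 0.95]),
}

def nearest_neighbor(word):
    """Most cosine-similar other word in the embedding table."""
    v = emb[word]
    best, best_sim = word, -1.0
    for w, u in emb.items():
        if w == word:
            continue
        sim = float(v @ u / (np.linalg.norm(v) * np.linalg.norm(u)))
        if sim > best_sim:
            best, best_sim = w, sim
    return best

def augment(sentence, p=1.0, rng=None):
    """Embedding-based synonym replacement: swap each in-vocabulary word
    for its nearest embedding neighbor with probability p, producing an
    additional labeled training example."""
    rng = rng if rng is not None else np.random.default_rng(0)
    out = []
    for w in sentence.split():
        if w in emb and rng.random() < p:
            out.append(nearest_neighbor(w))
        else:
            out.append(w)
    return " ".join(out)
```

Each augmented sentence inherits the label of its source tweet, which is how such augmentation enlarges the scarce annotated set that the LR models need.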

* Accepted to KDD 2020: Applied Data Science for Healthcare Workshop (4 pages, 2 figures, 2 tables) 

