Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yanjun Qi

General Multi-label Image Classification with Transformers

Nov 27, 2020

Jack Lanchantin, Tianlu Wang, Vicente Ordonez, Yanjun Qi

Figure 1 for General Multi-label Image Classification with Transformers

Figure 2 for General Multi-label Image Classification with Transformers

Figure 3 for General Multi-label Image Classification with Transformers

Figure 4 for General Multi-label Image Classification with Transformers

Abstract:Multi-label image classification is the task of predicting a set of labels corresponding to objects, attributes or other entities present in an image. In this work we propose the Classification Transformer (C-Tran), a general framework for multi-label image classification that leverages Transformers to exploit the complex dependencies among visual features and labels. Our approach consists of a Transformer encoder trained to predict a set of target labels given an input set of masked labels, and visual features from a convolutional neural network. A key ingredient of our method is a label mask training objective that uses a ternary encoding scheme to represent the state of the labels as positive, negative, or unknown during training. Our model shows state-of-the-art performance on challenging datasets such as COCO and Visual Genome. Moreover, because our model explicitly represents the uncertainty of labels during training, it is more general by allowing us to produce improved results for images with partial or extra label annotations during inference. We demonstrate this additional capability in the COCO, Visual Genome, News500, and CUB image datasets.

* 13 pages, 7 figures

Via

Access Paper or Ask Questions

Measuring Visual Generalization in Continuous Control from Pixels

Oct 13, 2020

Jake Grigsby, Yanjun Qi

Figure 1 for Measuring Visual Generalization in Continuous Control from Pixels

Figure 2 for Measuring Visual Generalization in Continuous Control from Pixels

Figure 3 for Measuring Visual Generalization in Continuous Control from Pixels

Figure 4 for Measuring Visual Generalization in Continuous Control from Pixels

Abstract:Self-supervised learning and data augmentation have significantly reduced the performance gap between state and image-based reinforcement learning agents in continuous control tasks. However, it is still unclear whether current techniques can face a variety of visual conditions required by real-world environments. We propose a challenging benchmark that tests agents' visual generalization by adding graphical variety to existing continuous control domains. Our empirical analysis shows that current methods struggle to generalize across a diverse set of visual changes, and we examine the specific factors of variation that make these tasks difficult. We find that data augmentation techniques outperform self-supervised learning approaches and that more significant image transformations provide better visual generalization \footnote{The benchmark and our augmented actor-critic implementation are open-sourced @ https://github.com/jakegrigsby/dmc_remastered)

* A total of 17 pages, 8 pages as the main text

Via

Access Paper or Ask Questions

Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples

Oct 12, 2020

Jin Yong Yoo, John X. Morris, Eli Lifland, Yanjun Qi

Figure 1 for Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples

Figure 2 for Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples

Figure 3 for Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples

Figure 4 for Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples

Abstract:We study the behavior of several black-box search algorithms used for generating adversarial examples for natural language processing (NLP) tasks. We perform a fine-grained analysis of three elements relevant to search: search algorithm, search space, and search budget. When new search algorithms are proposed in past work, the attack search space is often modified alongside the search algorithm. Without ablation studies benchmarking the search algorithm change with the search space held constant, one cannot tell if an increase in attack success rate is a result of an improved search algorithm or a less restrictive search space. Additionally, many previous studies fail to properly consider the search algorithms' run-time cost, which is essential for downstream tasks like adversarial training. Our experiments provide a reproducible benchmark of search algorithms across a variety of search spaces and query budgets to guide future research in adversarial NLP. Based on our experiments, we recommend greedy attacks with word importance ranking when under a time constraint or attacking long inputs, and either beam search or particle swarm optimization otherwise. Code implementation shared via https://github.com/QData/TextAttack-Search-Benchmark

* 14 pages, 5 figures, 4 tables; Accepted by EMNLP BlackBox NLP Workshop 2020 @ https://blackboxnlp.github.io/cfp.html

Via

Access Paper or Ask Questions

TextAttack: A Framework for Adversarial Attacks in Natural Language Processing

May 13, 2020

John X. Morris, Eli Lifland, Jin Yong Yoo, Yanjun Qi

Figure 1 for TextAttack: A Framework for Adversarial Attacks in Natural Language Processing

Figure 2 for TextAttack: A Framework for Adversarial Attacks in Natural Language Processing

Abstract:TextAttack is a library for running adversarial attacks against natural language processing (NLP) models. TextAttack builds attacks from four components: a search method, goal function, transformation, and a set of constraints. Researchers can use these components to easily assemble new attacks. Individual components can be isolated and compared for easier ablation studies. TextAttack currently supports attacks on models trained for text classification and entailment across a variety of datasets. Additionally, TextAttack's modular design makes it easily extensible to new NLP tasks, models, and attack strategies. TextAttack code and tutorials are available at https://github.com/QData/TextAttack.

* 6 pages. More details are shared at https://github.com/QData/TextAttack

Via

Access Paper or Ask Questions

Reevaluating Adversarial Examples in Natural Language

Apr 25, 2020

John X. Morris, Eli Lifland, Jack Lanchantin, Yangfeng Ji, Yanjun Qi

Figure 1 for Reevaluating Adversarial Examples in Natural Language

Figure 2 for Reevaluating Adversarial Examples in Natural Language

Figure 3 for Reevaluating Adversarial Examples in Natural Language

Figure 4 for Reevaluating Adversarial Examples in Natural Language

Abstract:State-of-the-art attacks on NLP models have different definitions of what constitutes a successful attack. These differences make the attacks difficult to compare. We propose to standardize definitions of natural language adversarial examples based on a set of linguistic constraints: semantics, grammaticality, edit distance, and non-suspicion. We categorize previous attacks based on these constraints. For each constraint, we suggest options for human and automatic evaluation methods. We use these methods to evaluate two state-of-the-art synonym substitution attacks. We find that perturbations often do not preserve semantics, and 45\% introduce grammatical errors. Next, we conduct human studies to find a threshold for each evaluation method that aligns with human judgment. Human surveys reveal that to truly preserve semantics, we need to significantly increase the minimum cosine similarity between the embeddings of swapped words and sentence encodings of original and perturbed inputs. After tightening these constraints to agree with the judgment of our human annotators, the attacks produce valid, successful adversarial examples. But quality comes at a cost: attack success rate drops by over 70 percentage points. Finally, we introduce TextAttack, a library for adversarial attacks in NLP.

* 14 pages; 10 Tables; 4 Figures

Via

Access Paper or Ask Questions

Differential Network Learning Beyond Data Samples

Apr 24, 2020

Arshdeep Sekhon, Beilun Wang, Zhe Wang, Yanjun Qi

Figure 1 for Differential Network Learning Beyond Data Samples

Figure 2 for Differential Network Learning Beyond Data Samples

Figure 3 for Differential Network Learning Beyond Data Samples

Figure 4 for Differential Network Learning Beyond Data Samples

Abstract:Learning the change of statistical dependencies between random variables is an essential task for many real-life applications, mostly in the high dimensional low sample regime. In this paper, we propose a novel differential parameter estimator that, in comparison to current methods, simultaneously allows (a) the flexible integration of multiple sources of information (data samples, variable groupings, extra pairwise evidence, etc.), (b) being scalable to a large number of variables, and (c) achieving a sharp asymptotic convergence rate. Our experiments, on more than 100 simulated and two real-world datasets, validate the flexibility of our approach and highlight the benefits of integrating spatial and anatomic information for brain connectome change discovery and epigenetic network identification.

* 9 pages of main draft; 25 pages of Appendix; 5 Tables ; 14 Figures ; Learning of Structure Difference between Two Graphical Models

Via

Access Paper or Ask Questions

Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning

Jan 16, 2020

Paola Cascante-Bonilla, Fuwen Tan, Yanjun Qi, Vicente Ordonez

Figure 1 for Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning

Figure 2 for Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning

Figure 3 for Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning

Figure 4 for Curriculum Labeling: Self-paced Pseudo-Labeling for Semi-Supervised Learning

Abstract:Semi-supervised learning aims to take advantage of a large amount of unlabeled data to improve the accuracy of a model that only has access to a small number of labeled examples. We propose curriculum labeling, an approach that exploits pseudo-labeling for propagating labels to unlabeled samples in an iterative and self-paced fashion. This approach is surprisingly simple and effective and surpasses or is comparable with the best methods proposed in the recent literature across all the standard benchmarks for image classification. Notably, we obtain 94.91% accuracy on CIFAR-10 using only 4,000 labeled samples, and 88.56% top-5 accuracy on Imagenet-ILSVRC using 128,000 labeled samples. In contrast to prior works, our approach shows improvements even in a more realistic scenario that leverages out-of-distribution unlabeled data samples.

Via

Access Paper or Ask Questions

Neural Message Passing for Multi-Label Classification

Apr 17, 2019

Jack Lanchantin, Arshdeep Sekhon, Yanjun Qi

Figure 1 for Neural Message Passing for Multi-Label Classification

Figure 2 for Neural Message Passing for Multi-Label Classification

Figure 3 for Neural Message Passing for Multi-Label Classification

Figure 4 for Neural Message Passing for Multi-Label Classification

Abstract:Multi-label classification (MLC) is the task of assigning a set of target labels for a given sample. Modeling the combinatorial label interactions in MLC has been a long-haul challenge. We propose Label Message Passing (LaMP) Neural Networks to efficiently model the joint prediction of multiple labels. LaMP treats labels as nodes on a label-interaction graph and computes the hidden representation of each label node conditioned on the input using attention-based neural message passing. Attention enables LaMP to assign different importance to neighbor nodes per label, learning how labels interact (implicitly). The proposed models are simple, accurate, interpretable, structure-agnostic, and applicable for predicting dense labels since LaMP is incredibly parallelizable. We validate the benefits of LaMP on seven real-world MLC datasets, covering a broad spectrum of input/output types and outperforming the state-of-the-art results. Notably, LaMP enables intuitive interpretation of how classifying each label depends on the elements of a sample and at the same time rely on its interaction with other labels. We provide our code and datasets at https://github.com/QData/LaMP

* 19pages. We provide our code and datasets at https://github.com/QData/LaMP

Via

Access Paper or Ask Questions

A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models

Jul 17, 2018

Beilun Wang, Arshdeep Sekhon, Yanjun Qi

Figure 1 for A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models

Figure 2 for A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models

Figure 3 for A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models

Figure 4 for A Fast and Scalable Joint Estimator for Integrating Additional Knowledge in Learning Multiple Related Sparse Gaussian Graphical Models

Abstract:We consider the problem of including additional knowledge in estimating sparse Gaussian graphical models (sGGMs) from aggregated samples, arising often in bioinformatics and neuroimaging applications. Previous joint sGGM estimators either fail to use existing knowledge or cannot scale-up to many tasks (large $K$) under a high-dimensional (large $p$) situation. In this paper, we propose a novel \underline{J}oint \underline{E}lementary \underline{E}stimator incorporating additional \underline{K}nowledge (JEEK) to infer multiple related sparse Gaussian Graphical models from large-scale heterogeneous data. Using domain knowledge as weights, we design a novel hybrid norm as the minimization objective to enforce the superposition of two weighted sparsity constraints, one on the shared interactions and the other on the task-specific structural patterns. This enables JEEK to elegantly consider various forms of existing knowledge based on the domain at hand and avoid the need to design knowledge-specific optimization. JEEK is solved through a fast and entry-wise parallelizable solution that largely improves the computational efficiency of the state-of-the-art $O(p^5K^4)$ to $O(p^2K^4)$. We conduct a rigorous statistical analysis showing that JEEK achieves the same convergence rate $O(\log(Kp)/n_{tot})$ as the state-of-the-art estimators that are much harder to compute. Empirically, on multiple synthetic datasets and two real-world data, JEEK outperforms the speed of the state-of-arts significantly while achieving the same level of prediction accuracy. Available as R tool "jeek"

* ICML 2018; Proof and Design of W in Appendix; Available as R tool "jeek"

Via

Access Paper or Ask Questions

DeepDiff: Deep-learning for predicting Differential gene expression from histone modifications

Jul 10, 2018

Arshdeep Sekhon, Ritambhara Singh, Yanjun Qi

Figure 1 for DeepDiff: Deep-learning for predicting Differential gene expression from histone modifications

Figure 2 for DeepDiff: Deep-learning for predicting Differential gene expression from histone modifications

Figure 3 for DeepDiff: Deep-learning for predicting Differential gene expression from histone modifications

Figure 4 for DeepDiff: Deep-learning for predicting Differential gene expression from histone modifications

Abstract:Computational methods that predict differential gene expression from histone modification signals are highly desirable for understanding how histone modifications control the functional heterogeneity of cells through influencing differential gene regulation. Recent studies either failed to capture combinatorial effects on differential prediction or primarily only focused on cell type-specific analysis. In this paper, we develop a novel attention-based deep learning architecture, DeepDiff, that provides a unified and end-to-end solution to model and to interpret how dependencies among histone modifications control the differential patterns of gene regulation. DeepDiff uses a hierarchy of multiple Long short-term memory (LSTM) modules to encode the spatial structure of input signals and to model how various histone modifications cooperate automatically. We introduce and train two levels of attention jointly with the target prediction, enabling DeepDiff to attend differentially to relevant modifications and to locate important genome positions for each modification. Additionally, DeepDiff introduces a novel deep-learning based multi-task formulation to use the cell-type-specific gene expression predictions as auxiliary tasks, encouraging richer feature embeddings in our primary task of differential expression prediction. Using data from Roadmap Epigenomics Project (REMC) for ten different pairs of cell types, we show that DeepDiff significantly outperforms the state-of-the-art baselines for differential gene expression prediction. The learned attention weights are validated by observations from previous studies about how epigenetic mechanisms connect to differential gene expression. Codes and results are available at \url{deepchrome.org}

Via

Access Paper or Ask Questions