Tim Oates

cuSLINK: Single-linkage Agglomerative Clustering on the GPU

Jun 28, 2023
Corey J. Nolet, Divye Gala, Alex Fender, Mahesh Doijade, Joe Eaton, Edward Raff, John Zedlewski, Brad Rees, Tim Oates

In this paper, we propose cuSLINK, a novel and state-of-the-art reformulation of the SLINK algorithm on the GPU which requires only $O(Nk)$ space and uses a parameter $k$ to trade off space and time. We also propose a set of novel and reusable building blocks that compose cuSLINK. These building blocks include highly optimized computational patterns for $k$-NN graph construction, spanning trees, and dendrogram cluster extraction. We show how we used our primitives to implement cuSLINK end-to-end on the GPU, further enabling a wide range of real-world data mining and machine learning applications that were once intractable. In addition to being a primary computational bottleneck in the popular HDBSCAN algorithm, the impact of our end-to-end cuSLINK algorithm spans a large range of important applications, including cluster analysis in social and computer networks, natural language processing, and computer vision. Users can obtain cuSLINK at https://docs.rapids.ai/api/cuml/latest/api/#agglomerative-clustering

* To appear in ECML PKDD 2023 by Springer Nature 
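For readers who want to try it, the sketch below shows one way to invoke GPU single-linkage clustering through the cuML library that ships cuSLINK. It assumes a CUDA-capable GPU and a recent RAPIDS/cuML install; the parameter names (n_clusters, linkage, connectivity, n_neighbors) follow the cuML documentation linked above and may differ between releases.

```python
# Minimal usage sketch, assuming a CUDA GPU and a recent RAPIDS/cuML installation.
import cupy as cp
from cuml.cluster import AgglomerativeClustering

X = cp.random.random((10_000, 32), dtype=cp.float32)  # synthetic feature matrix on the GPU

model = AgglomerativeClustering(
    n_clusters=8,
    linkage="single",      # single linkage is the case cuSLINK implements
    connectivity="knn",    # k-NN connectivity keeps memory near O(Nk) rather than O(N^2)
    n_neighbors=15,        # the "k" that trades space for time
)
labels = model.fit_predict(X)
print(labels[:10])
```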

Recasting Self-Attention with Holographic Reduced Representations

May 31, 2023
Mohammad Mahmudul Alam, Edward Raff, Stella Biderman, Tim Oates, James Holt

In recent years, self-attention has become the dominant paradigm for sequence modeling in a variety of domains. However, in domains with very long sequence lengths, the $\mathcal{O}(T^2)$ memory and $\mathcal{O}(T^2 H)$ compute costs can make using transformers infeasible. Motivated by problems in malware detection, where sequence lengths of $T \geq 100,000$ are a roadblock to deep learning, we re-cast self-attention using the neuro-symbolic approach of Holographic Reduced Representations (HRR). In doing so, we perform the same high-level strategy as standard self-attention: a set of queries is matched against a set of keys, and a weighted response of the values is returned for each key. Implemented as a "Hrrformer," we obtain several benefits, including $\mathcal{O}(T H \log H)$ time complexity, $\mathcal{O}(T H)$ space complexity, and convergence in $10\times$ fewer epochs. Despite these savings, the Hrrformer achieves near state-of-the-art accuracy on the LRA benchmarks, and we are able to learn with just a single layer. Combined, these benefits make our Hrrformer the first viable Transformer for such long malware classification sequences and up to $280\times$ faster to train on the Long Range Arena benchmark. Code is available at https://github.com/NeuromorphicComputationResearchProgram/Hrrformer

* To appear in Proceedings of the 40th International Conference on Machine Learning (ICML) 
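To make the HRR mechanics above concrete, here is a small numpy sketch of binding keys to values with circular convolution, superposing the bound pairs, and unbinding with a query to retrieve an approximate response. It illustrates only the high-level strategy; the toy setup and names are assumptions of this sketch, and the authors' actual implementation is in the linked repository.

```python
# HRR attention idea in miniature: bind keys to values, superpose, unbind with a query.
import numpy as np

def bind(a, b):
    """Circular convolution (HRR binding), computed in O(H log H) via the FFT."""
    return np.fft.irfft(np.fft.rfft(a) * np.fft.rfft(b), n=a.shape[-1])

def inverse(a):
    """Approximate HRR inverse (involution): a[0], a[-1], a[-2], ..."""
    return np.concatenate([a[:1], a[1:][::-1]])

H = 256
rng = np.random.default_rng(0)
norm = lambda v: v / np.linalg.norm(v)

keys   = [norm(rng.standard_normal(H)) for _ in range(8)]
values = [norm(rng.standard_normal(H)) for _ in range(8)]

memory = sum(bind(k, v) for k, v in zip(keys, values))  # superposition of bound pairs

query = keys[3]                          # a query that matches the 4th key
response = bind(inverse(query), memory)  # unbinding retrieves a noisy copy of values[3]

sims = [float(response @ v) for v in values]
print(np.argmax(sims))  # expected: 3
```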

RFC-Net: Learning High Resolution Global Features for Medical Image Segmentation on a Computational Budget

Feb 13, 2023
Sourajit Saha, Shaswati Saha, Md Osman Gani, Tim Oates, David Chapman

Learning high-resolution representations is essential for semantic segmentation. Convolutional neural network (CNN) architectures with downstream and upstream propagation flow are popular for segmentation in medical diagnosis. However, because spatial downsampling and upsampling are performed in multiple stages, information loss is inevitable. Conversely, connecting layers densely at high spatial resolution is computationally expensive. In this work, we devise a Loose Dense Connection Strategy to connect neurons in subsequent layers with fewer parameters. On top of that, using an m-way tree structure for feature propagation, we propose the Receptive Field Chain Network (RFC-Net), which learns high-resolution global features in a compressed computational space. Our experiments demonstrate that RFC-Net achieves state-of-the-art performance on the Kvasir and CVC-ClinicDB benchmarks for polyp segmentation.

* In Proceedings of AAAI Conference on Artificial Intelligence 2023 

Backdoor Attack Detection in Computer Vision by Applying Matrix Factorization on the Weights of Deep Networks

Dec 15, 2022
Khondoker Murad Hossain, Tim Oates

The increasing importance of both deep neural networks (DNNs) and cloud services for training them means that bad actors have more incentive and opportunity to insert backdoors that alter the behavior of trained models. In this paper, we introduce a novel method for backdoor detection that extracts features from pre-trained DNNs' weights using independent vector analysis (IVA), followed by a machine learning classifier. Compared to other detection techniques, this has a number of benefits: it requires no training data, is applicable across domains, operates with a wide range of network architectures, makes no assumptions about the nature of the triggers used to change network behavior, and is highly scalable. We discuss the detection pipeline and then demonstrate results on two computer vision datasets covering image classification and object detection. Our method outperforms competing algorithms in both efficiency and accuracy, helping to ensure the safe application of deep learning and AI.

* 7 pages, 4 figures, 5 tables, AAAI Workshop on Safe AI 2023 
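The sketch below mirrors the two-stage pipeline described in the abstract on purely synthetic data: unsupervised source separation over flattened model weights, followed by a supervised classifier. Scikit-learn does not provide IVA, so FastICA is used here only as an illustrative stand-in, and the "weights" and labels are random placeholders rather than real pre-trained networks.

```python
# Synthetic illustration of a weights -> source separation -> classifier pipeline.
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_models, weight_dim = 200, 4096

# One flattened weight vector per model; label 1 marks a (synthetic) backdoored model.
weights = rng.standard_normal((n_models, weight_dim)).astype(np.float32)
labels = rng.integers(0, 2, size=n_models)
weights[labels == 1, :64] += 0.5  # inject a weak synthetic signature for the demo

# Step 1: unsupervised source separation on the weight matrix (IVA in the paper; FastICA here).
features = FastICA(n_components=16, random_state=0).fit_transform(weights)

# Step 2: supervised classifier on the extracted features.
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, features, labels, cv=5).mean())
```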

Lempel-Ziv Networks

Nov 23, 2022
Rebecca Saul, Mohammad Mahmudul Alam, John Hurwitz, Edward Raff, Tim Oates, James Holt

Sequence processing has long been a central area of machine learning research. Recurrent neural nets have been successful in processing sequences for a number of tasks; however, they are known to be both ineffective and computationally expensive when applied to very long sequences. Compression-based methods have demonstrated more robustness when processing such sequences -- in particular, an approach pairing the Lempel-Ziv Jaccard Distance (LZJD) with the k-Nearest Neighbor algorithm has shown promise on long sequence problems (up to $T=200,000,000$ steps) involving malware classification. Unfortunately, the use of LZJD is limited to discrete domains. To extend the benefits of LZJD to a continuous domain, we investigate the effectiveness of a deep-learning analog of the algorithm, the Lempel-Ziv Network. While we achieve a successful proof of concept, we are unable to improve meaningfully on the performance of a standard LSTM across a variety of datasets and sequence processing tasks. In addition to presenting this negative result, our work highlights the problem of sub-par baseline tuning in newer research areas.

* I Can't Believe It's Not Better Workshop at NeurIPS 2022 
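As background for the compression-based baseline mentioned above, here is a compact, unoptimized reimplementation of the LZJD idea: each byte sequence is parsed LZ78-style into a set of novel substrings, and the distance between two sequences is the Jaccard distance of those sets. In the referenced work this distance is paired with a k-nearest-neighbor classifier; this sketch is illustrative, not the authors' code.

```python
# Lempel-Ziv Jaccard Distance (LZJD), in miniature.

def lz_set(data: bytes) -> set:
    """LZ78-style dictionary: grow a substring until it is novel, store it, then restart."""
    seen, start = set(), 0
    for end in range(1, len(data) + 1):
        chunk = data[start:end]
        if chunk not in seen:
            seen.add(chunk)
            start = end  # begin a new phrase after each novel substring
    return seen

def lzjd(a: bytes, b: bytes) -> float:
    """Jaccard distance between the two Lempel-Ziv dictionaries (0 = identical)."""
    sa, sb = lz_set(a), lz_set(b)
    return 1.0 - len(sa & sb) / len(sa | sb)

print(lzjd(b"abababab" * 100, b"abababab" * 100))  # 0.0
print(lzjd(b"abababab" * 100, b"zyxwvuts" * 100))  # 1.0 (no shared substrings)
```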

Towards an Interpretable Hierarchical Agent Framework using Semantic Goals

Oct 16, 2022
Bharat Prakash, Nicholas Waytowich, Tim Oates, Tinoosh Mohsenin

Learning to solve long-horizon, temporally extended tasks with reinforcement learning has been a challenge for several years. We believe it is important both to leverage the hierarchical structure of complex tasks and to use expert supervision whenever possible. This work introduces an interpretable hierarchical agent framework that combines planning with semantic goal-directed reinforcement learning. We assume access to certain spatial and haptic predicates and construct a simple yet powerful semantic goal space. These semantic goal representations are more interpretable, making expert supervision and intervention easier. They also eliminate the need to write complex, dense reward functions, thereby reducing human engineering effort. We evaluate our framework on a robotic block manipulation task and show that it performs better than other methods, including baselines that use sparse or dense reward functions. We also suggest next steps and discuss how this framework makes interaction and collaboration with humans easier.
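As a rough illustration of what a predicate-based semantic goal space can look like, the sketch below represents a goal as a set of symbolic predicates and checks success by set inclusion. The predicate names and objects are hypothetical; the paper's actual predicates and environment interface may differ.

```python
# Hypothetical semantic goal space built from symbolic predicates.
from dataclasses import dataclass

@dataclass(frozen=True)
class Predicate:
    name: str    # e.g. "on", "grasped", "near"
    args: tuple  # object identifiers, e.g. ("block_a", "block_b")

# A semantic goal is a set of predicates that must hold in the final state.
goal = frozenset({Predicate("on", ("block_a", "block_b"))})

def goal_reached(state_predicates: frozenset, goal: frozenset) -> bool:
    """Interpretable success check: every goal predicate holds in the current state."""
    return goal <= state_predicates

state = frozenset({
    Predicate("on", ("block_a", "block_b")),
    Predicate("near", ("gripper", "block_b")),
})
print(goal_reached(state, goal))  # True
```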

Deploying Convolutional Networks on Untrusted Platforms Using 2D Holographic Reduced Representations

Jun 13, 2022
Mohammad Mahmudul Alam, Edward Raff, Tim Oates, James Holt

Due to the computational cost of running inference for a neural network, the need to deploy the inferential steps on a third party's compute environment or hardware is common. If the third party is not fully trusted, it is desirable to obfuscate the nature of the inputs and outputs so that the third party cannot easily determine what specific task is being performed. Provably secure protocols for leveraging an untrusted party exist but are too computationally demanding to run in practice. We instead explore a different strategy of fast, heuristic security that we call Connectionist Symbolic Pseudo Secrets. By leveraging Holographic Reduced Representations (HRRs), we create a neural network with a pseudo-encryption-style defense that empirically shows robustness to attack, even under threat models that unrealistically favor the adversary.

* To appear in the Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022 
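The sketch below illustrates the binding primitive behind this pseudo-encryption idea in 2D: the data owner binds an input with a private random key via 2D circular convolution before handing it to an untrusted host, and can unbind results locally. It shows only the obfuscation primitive with made-up shapes; the paper builds a full network that operates directly on bound representations.

```python
# 2D HRR binding/unbinding as an obfuscation step, on synthetic data.
import numpy as np

def bind2d(x, key):
    """2D circular convolution via the 2D FFT (2D HRR binding)."""
    return np.fft.irfft2(np.fft.rfft2(x) * np.fft.rfft2(key), s=x.shape)

def unbind2d(y, key):
    """Unbinding via the key's regularized inverse in the Fourier domain."""
    K = np.fft.rfft2(key)
    return np.fft.irfft2(np.fft.rfft2(y) * np.conj(K) / (np.abs(K) ** 2 + 1e-8), s=y.shape)

rng = np.random.default_rng(0)
image = rng.random((32, 32))             # stand-in for a private input
secret = rng.standard_normal((32, 32))   # key kept by the data owner

obfuscated = bind2d(image, secret)       # what the untrusted platform would see
recovered = unbind2d(obfuscated, secret)
print(np.max(np.abs(recovered - image)))  # small reconstruction error
```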

Automatic Goal Generation using Dynamical Distance Learning

Nov 07, 2021
Bharat Prakash, Nicholas Waytowich, Tinoosh Mohsenin, Tim Oates

Reinforcement Learning (RL) agents can learn to solve complex sequential decision-making tasks by interacting with the environment. However, sample efficiency remains a major challenge. In the field of multi-goal RL, where agents must reach multiple goals to solve complex tasks, improving sample efficiency can be especially challenging. Humans and other biological agents, on the other hand, learn such tasks in a much more strategic way, following a curriculum in which tasks are sampled with increasing difficulty in order to make gradual and efficient learning progress. In this work, we propose a method for automatic goal generation using a dynamical distance function (DDF) in a self-supervised fashion. The DDF is a function that predicts the dynamical distance between any two states within a Markov decision process (MDP). With this, we generate a curriculum of goals at the appropriate difficulty level to facilitate efficient learning throughout the training process. We evaluate this approach on several goal-conditioned robotic manipulation and navigation tasks, and show improvements in sample efficiency over a baseline method that uses only random goal sampling.
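A simplified sketch of the curriculum idea follows: a regressor is fit to predict how many steps separate two states observed in the same trajectory (a stand-in for the DDF), and candidate goals are then filtered to a target difficulty band. The environment, trajectories, and band limits are synthetic placeholders, not the paper's setup.

```python
# Toy DDF-style curriculum: learn step-distance between states, pick goals in a difficulty band.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic trajectories: random walks in a 2D state space.
trajectories = [np.cumsum(rng.normal(size=(100, 2)), axis=0) for _ in range(20)]

# Training pairs: (state_i, state_j) -> |j - i| steps apart, sampled within a trajectory.
X, y = [], []
for traj in trajectories:
    for _ in range(200):
        i, j = sorted(rng.integers(0, len(traj), size=2))
        X.append(np.concatenate([traj[i], traj[j]]))
        y.append(j - i)
ddf = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500).fit(np.array(X), np.array(y))

def propose_goal(current_state, candidates, lo=10, hi=30):
    """Pick a candidate goal whose predicted dynamical distance lies in the target band."""
    pairs = np.hstack([np.tile(current_state, (len(candidates), 1)), candidates])
    dist = ddf.predict(pairs)
    in_band = np.where((dist >= lo) & (dist <= hi))[0]
    return candidates[in_band[0]] if len(in_band) else candidates[np.argmin(np.abs(dist - (lo + hi) / 2))]

print(propose_goal(np.zeros(2), rng.normal(scale=5.0, size=(50, 2))))
```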

Interactive Hierarchical Guidance using Language

Oct 09, 2021
Bharat Prakash, Nicholas Waytowich, Tim Oates, Tinoosh Mohsenin

Reinforcement learning has been successful in many tasks, ranging from robotic control and games to energy management. In complex real-world environments with sparse rewards and long task horizons, sample efficiency is still a major challenge. Most complex tasks can be easily decomposed into high-level planning and low-level control, so it is important to enable agents to leverage this hierarchical structure and decompose bigger tasks into multiple smaller sub-tasks. We introduce an approach in which language is used to specify sub-tasks and a high-level planner issues language commands to a low-level controller. The low-level controller executes the sub-tasks based on the language commands. Our experiments show that this method is able to solve complex long-horizon planning tasks with limited human supervision. Using language has the added benefits of interpretability and the ability for expert humans to take over the high-level planning task and provide language commands when necessary.

* Presented at AI-HRI symposium as part of AAAI-FSS 2021 (arXiv:2109.10836) 
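A toy sketch of the planner/controller split is shown below: the high-level plan is a list of language commands, each routed to a low-level skill, and a human expert can override the plan by supplying their own commands. The command strings, skills, and override hook are illustrative assumptions only.

```python
# Minimal planner -> language command -> low-level controller loop (hypothetical interface).
from typing import Callable, Dict, List, Optional

def run_episode(plan: List[str],
                skills: Dict[str, Callable[[], None]],
                human_override: Optional[List[str]] = None) -> None:
    """Execute the plan; an expert can replace it with their own language commands."""
    for command in (human_override or plan):
        skills[command]()  # route the language command to the matching low-level skill

skills = {
    "go to the key":   lambda: print("executing: navigate to key"),
    "pick up the key": lambda: print("executing: grasp key"),
    "open the door":   lambda: print("executing: open door"),
}
run_episode(["go to the key", "pick up the key", "open the door"], skills)
```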

Determining Standard Occupational Classification Codes from Job Descriptions in Immigration Petitions

Sep 30, 2021
Sourav Mukherjee, David Widmark, Vince DiMascio, Tim Oates

Accurate specification of the standard occupational classification (SOC) code is critical to the success of many U.S. work visa applications. Determining the correct SOC code relies on a careful study of job requirements and comparison to the definitions given by the U.S. Bureau of Labor Statistics, which is often a tedious activity. In this paper, we apply methods from natural language processing (NLP) to computationally determine the SOC code based on the job description. We implement and empirically evaluate a broad variety of predictive models with respect to prediction quality and training time, and identify the models best suited for this task.

* To appear in ICDM 2021 workshop: MLLD-2021 
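As one plausible baseline for this task (the paper evaluates a broad range of models), the sketch below classifies job descriptions into SOC codes with a TF-IDF representation and a logistic regression classifier. The example descriptions and code labels are made up for illustration.

```python
# Text-classification baseline sketch: job description -> SOC code.
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

descriptions = [
    "design and develop software applications in Java and Python",
    "prepare financial statements and perform account reconciliations",
    "develop machine learning models and data pipelines",
    "audit ledgers and ensure compliance with accounting standards",
]
soc_codes = ["15-1252", "13-2011", "15-2051", "13-2011"]  # illustrative labels only

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2), min_df=1)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(descriptions, soc_codes)
print(model.predict(["build and maintain web services in Python"]))
```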