Ole-Christoffer Granmo

Harnessing Attention Mechanisms: Efficient Sequence Reduction using Attention-based Autoencoders

Oct 23, 2023

Daniel Biermann, Fabrizio Palumbo, Morten Goodwin, Ole-Christoffer Granmo

Abstract:Many machine learning models use the manipulation of dimensions as a driving force to enable models to identify and learn important features in data. In the case of sequential data this manipulation usually happens on the token dimension level. Despite the fact that many tasks require a change in sequence length itself, the step of sequence length reduction usually happens out of necessity and in a single step. As far as we are aware, no model uses the sequence length reduction step as an additional opportunity to tune the model's performance. In fact, sequence length manipulation as a whole seems to be an overlooked direction. In this study we introduce a novel attention-based method that allows for the direct manipulation of sequence lengths. To explore the method's capabilities, we employ it in an autoencoder model. The autoencoder reduces the input sequence to a smaller sequence in latent space. It then aims to reproduce the original sequence from this reduced form. In this setting, we explore the method's reduction performance for different input and latent sequence lengths. We are able to show that the autoencoder retains all the significant information when reducing the original sequence to half its original size. When reducing down to as low as a quarter of its original size, the autoencoder is still able to reproduce the original sequence with an accuracy of around 90%.

* 8 pages, 5 images, 1 table
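
The abstract does not spell out the attention layer, so the following is only a minimal sketch of one way attention can change sequence length. It assumes a fixed set of latent query vectors (the names `queries` and `reduce_sequence` are hypothetical) that attend over the input tokens, so the output has as many rows as there are queries. The paper's actual method may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def reduce_sequence(tokens, queries):
    """Map an (n, d) token sequence to an (m, d) sequence via attention.

    tokens:  (n, d) input sequence
    queries: (m, d) latent queries (assumed learned; random here)
    """
    d = tokens.shape[-1]
    scores = queries @ tokens.T / np.sqrt(d)   # (m, n) similarity scores
    weights = softmax(scores, axis=-1)         # each query's weights sum to 1
    return weights @ tokens, weights           # (m, d) reduced sequence

rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 16))    # input length 8
queries = rng.normal(size=(4, 16))   # latent length 4 (half the input)
latent, weights = reduce_sequence(tokens, queries)
```

In an autoencoder setting, a second set of queries of the original length could expand the latent sequence back; that decoder is omitted here.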

Contracting Tsetlin Machine with Absorbing Automata

Oct 17, 2023

Bimal Bhattarai, Ole-Christoffer Granmo, Lei Jiao, Per-Arne Andersen, Svein Anders Tunheim, Rishad Shafik, Alex Yakovlev

Abstract:In this paper, we introduce a sparse Tsetlin Machine (TM) with absorbing Tsetlin Automata (TA) states. In brief, the TA of each clause literal has both an absorbing Exclude- and an absorbing Include state, making the learning scheme absorbing instead of ergodic. When a TA reaches an absorbing state, it will never leave that state again. If the absorbing state is an Exclude state, both the automaton and the literal can be removed from further consideration. The literal will as a result never participate in that clause. If the absorbing state is an Include state, on the other hand, the literal is stored as a permanent part of the clause while the TA is discarded. A novel sparse data structure supports these updates by means of three action lists: Absorbed Include, Include, and Exclude. By updating these lists, the TM gets smaller and smaller as the literals and their TA withdraw. In this manner, the computation accelerates during learning, leading to faster learning and less energy consumption.

* Accepted to ISTM2023. 7 pages, 8 figures
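
The absorbing behaviour described above can be illustrated with a toy sketch. It assumes a two-action automaton with `n` states per action, where the outermost state on each side is absorbing, and it moves the state deterministically by one step per update. Real TM feedback is stochastic, and the class names and dictionary layout below are hypothetical rather than the paper's data structure.

```python
class AbsorbingAutomaton:
    """Two-action automaton: states 1..n exclude, n+1..2n include."""

    def __init__(self, n=3):
        self.n = n
        self.state = n  # start at the Exclude state nearest the boundary

    @property
    def action(self):
        return "include" if self.state > self.n else "exclude"

    @property
    def absorbed(self):
        # State 1 (deep Exclude) and state 2n (deep Include) are absorbing.
        return self.state in (1, 2 * self.n)

    def step(self, toward_include):
        if not self.absorbed:
            self.state += 1 if toward_include else -1


class Clause:
    def __init__(self, literals, n=3):
        self.absorbed_include = []  # permanent literals, automaton discarded
        self.automata = {lit: AbsorbingAutomaton(n) for lit in literals}

    def update(self, literal, toward_include):
        ta = self.automata.get(literal)
        if ta is None:
            return  # literal already absorbed or removed
        ta.step(toward_include)
        if ta.absorbed:
            if ta.action == "include":
                self.absorbed_include.append(literal)
            del self.automata[literal]  # the clause shrinks either way

    def included(self):
        # Literals still governed by a live automaton that says "include".
        return [l for l, ta in self.automata.items() if ta.action == "include"]


clause = Clause(["x1", "x2"], n=2)
clause.update("x1", True)    # state 2 -> 3: include, not yet absorbed
clause.update("x1", True)    # state 3 -> 4: absorbed Include
clause.update("x2", False)   # state 2 -> 1: absorbed Exclude, removed
```

After these updates no automata remain, which is the sense in which the machine contracts as learning proceeds.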

Generalized Convergence Analysis of Tsetlin Machines: A Probabilistic Approach to Concept Learning

Oct 03, 2023

Mohamed-Bachir Belaid, Jivitesh Sharma, Lei Jiao, Ole-Christoffer Granmo, Per-Arne Andersen, Anis Yazidi

Abstract:Tsetlin Machines (TMs) have garnered increasing interest for their ability to learn concepts via propositional formulas and their proven efficiency across various application domains. Despite this, the convergence proof for the TMs, particularly for the AND operator (conjunction of literals), in the generalized case (inputs greater than two bits) remains an open problem. This paper aims to fill this gap by presenting a comprehensive convergence analysis of Tsetlin automaton-based Machine Learning algorithms. We introduce a novel framework, referred to as Probabilistic Concept Learning (PCL), which simplifies the TM structure while incorporating dedicated feedback mechanisms and dedicated inclusion/exclusion probabilities for literals. Given $n$ features, PCL aims to learn a set of conjunction clauses $C_i$ each associated with a distinct inclusion probability $p_i$. Most importantly, we establish a theoretical proof confirming that, for any clause $C_k$, PCL converges to a conjunction of literals when $0.5<p_k<1$. This result serves as a stepping stone for future research on the convergence properties of Tsetlin automaton-based learning algorithms. Our findings not only contribute to the theoretical understanding of Tsetlin Machines but also have implications for their practical application, potentially leading to more robust and interpretable machine learning models.

Learning Minimalistic Tsetlin Machine Clauses with Markov Boundary-Guided Pruning

Sep 12, 2023

Ole-Christoffer Granmo, Per-Arne Andersen, Lei Jiao, Xuan Zhang, Christian Blakely, Tor Tveit

Abstract:A set of variables is the Markov blanket of a random variable if it contains all the information needed for predicting the variable. If the blanket cannot be reduced without losing useful information, it is called a Markov boundary. Identifying the Markov boundary of a random variable is advantageous because all variables outside the boundary are superfluous. Hence, the Markov boundary provides an optimal feature set. However, learning the Markov boundary from data is challenging for two reasons. If one or more variables are removed from the Markov boundary, variables outside the boundary may start providing information. Conversely, variables within the boundary may stop providing information. The true role of each candidate variable only becomes apparent once the Markov boundary has been identified. In this paper, we propose a new Tsetlin Machine (TM) feedback scheme that supplements Type I and Type II feedback. The scheme introduces a novel Finite State Automaton - a Context-Specific Independence Automaton. The automaton learns which features are outside the Markov boundary of the target, allowing them to be pruned from the TM during learning. We investigate the new scheme empirically, showing how it is capable of exploiting context-specific independence to find Markov boundaries. Further, we provide a theoretical analysis of convergence. Our approach thus connects the field of Bayesian networks (BN) with TMs, potentially opening up synergies when it comes to inference and learning, including TM-produced Bayesian knowledge bases and TM-based Bayesian inference.

* Accepted to ISTM2023, 8 pages, 6 figures

TMComposites: Plug-and-Play Collaboration Between Specialized Tsetlin Machines

Sep 12, 2023

Ole-Christoffer Granmo

Abstract:Tsetlin Machines (TMs) provide a fundamental shift from arithmetic-based to logic-based machine learning. Supporting convolution, they deal successfully with image classification datasets like MNIST, Fashion-MNIST, and CIFAR-2. However, the TM struggles with getting state-of-the-art performance on CIFAR-10 and CIFAR-100, representing more complex tasks. This paper introduces plug-and-play collaboration between specialized TMs, referred to as TM Composites. The collaboration relies on a TM's ability to specialize during learning and to assess its competence during inference. When teaming up, the most confident TMs make the decisions, relieving the uncertain ones. In this manner, a TM Composite becomes more competent than its members, benefiting from their specializations. The collaboration is plug-and-play in that members can be combined in any way, at any time, without fine-tuning. We implement three TM specializations in our empirical evaluation: Histogram of Gradients, Adaptive Gaussian Thresholding, and Color Thermometers. The resulting TM Composite increases accuracy on Fashion-MNIST by two percentage points, CIFAR-10 by twelve points, and CIFAR-100 by nine points, yielding new state-of-the-art results for TMs. Overall, we envision that TM Composites will enable an ultra-low energy and transparent alternative to state-of-the-art deep learning on more tasks and datasets.

* 8 pages, 6 figures
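
The idea that the most confident members dominate the decision can be sketched as follows. The sketch assumes each member outputs a matrix of per-class scores, scales every member's scores by that member's overall score range so members are comparable, and sums the results. The function name `composite_predict` and this particular normalization are assumptions for illustration, not confirmed details of the paper.

```python
import numpy as np

def composite_predict(member_scores):
    """Combine per-class scores from several members.

    member_scores: list of (samples, classes) arrays, one per member.
    Returns the predicted class index for each sample.
    """
    total = None
    for scores in member_scores:
        scores = np.asarray(scores, dtype=float)
        span = scores.max() - scores.min()
        # Scale by the member's overall range (assumed normalization);
        # a member with a flat output contributes nothing.
        scaled = scores / span if span > 0 else np.zeros_like(scores)
        total = scaled if total is None else total + scaled
    return total.argmax(axis=1)

# Member A is decisive on sample 0, member B on sample 1.
member_a = [[10, -10, 0], [1, 0, -1]]
member_b = [[0, 1, -1], [-5, 5, 0]]
predictions = composite_predict([member_a, member_b])
```

Because a member's margin between classes is large only where it is confident, the sum lets member A decide sample 0 and member B decide sample 1, without retraining either one.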

An FPGA Architecture for Online Learning using the Tsetlin Machine

Jun 01, 2023

Samuel Prescott, Adrian Wheeldon, Rishad Shafik, Tousif Rahman, Alex Yakovlev, Ole-Christoffer Granmo

Abstract:There is a need for machine learning models to evolve in unsupervised circumstances. New classifications may be introduced, unexpected faults may occur, or the initial dataset may be small compared to the data-points presented to the system during normal operation. Implementing such a system using neural networks involves significant mathematical complexity, which is a major issue in power-critical edge applications. This paper proposes a novel field-programmable gate array (FPGA) infrastructure for online learning, implementing a low-complexity machine learning algorithm called the Tsetlin Machine. This infrastructure features a custom-designed architecture for run-time learning management, providing on-chip offline and online learning. Using this architecture, training can be carried out on-demand on the FPGA with pre-classified data before inference takes place. Additionally, our architecture provisions online learning, where training can be interleaved with inference during operation. Tsetlin Machine (TM) training naturally descends to an optimum, with training also linked to a threshold hyper-parameter which is used to reduce the probability of issuing feedback as the TM becomes trained further. The proposed architecture is modular, allowing the data input source to be easily changed, whilst inbuilt cross-validation infrastructure allows for reliable and representative results during system testing. We present use cases for online learning using the proposed infrastructure and demonstrate the energy/performance/accuracy trade-offs.
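
The role of the threshold hyper-parameter can be made concrete with a short sketch. It assumes the standard TM rule in which the class sum is clipped to the range [-T, T] and the probability of issuing feedback shrinks linearly as the clipped sum approaches the desired extreme. The function name is hypothetical, and the FPGA design in the paper may realize the rule differently.

```python
def feedback_probability(class_sum, threshold, is_target=True):
    """Probability of issuing feedback for one class of a Tsetlin Machine.

    class_sum: summed clause votes for the class
    threshold: the hyper-parameter T (must be > 0)
    is_target: True if this class is the correct label for the sample
    """
    clipped = max(-threshold, min(threshold, class_sum))
    if is_target:
        # Less feedback as the vote for the correct class nears +T.
        return (threshold - clipped) / (2 * threshold)
    # Less feedback as the vote for a wrong class nears -T.
    return (threshold + clipped) / (2 * threshold)
```

With T = 10, a target class whose sum is 0 receives feedback half the time, and one whose sum has reached 10 receives none, which is how training activity tapers off as the machine becomes trained.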

Energy-frugal and Interpretable AI Hardware Design using Learning Automata

May 19, 2023

Rishad Shafik, Tousif Rahman, Adrian Wheeldon, Ole-Christoffer Granmo, Alex Yakovlev

Abstract:Energy efficiency is a crucial requirement for enabling powerful artificial intelligence applications at the microedge. Hardware acceleration with frugal architectural allocation is an effective method for reducing energy. Many emerging applications also require the systems design to incorporate interpretable decision models to establish responsibility and transparency. The design needs to provision additional resources to provide reachable states in real-world data scenarios, which creates design tradeoffs that conflict with energy efficiency. Recently a new machine learning algorithm, called the Tsetlin machine, has been proposed. The algorithm is fundamentally based on the principles of finite-state automata and benefits from natural logic underpinning rather than arithmetic. In this paper, we investigate methods of energy-frugal artificial intelligence hardware design by suitably tuning the hyperparameters, while maintaining high learning efficacy. To demonstrate interpretability, we use reachability and game-theoretic analysis in two simulation environments: a SystemC model to study the bounded state transitions in the presence of hardware faults and Nash equilibrium between states to analyze the learning convergence. Our analyses provide the first insights into conflicting design tradeoffs involved in energy-efficient and interpretable decision models for this new artificial intelligence hardware architecture. We show that frugal resource allocation coupled with systematic prodigality between randomized reinforcements can provide decisive energy reduction while also achieving robust and interpretable learning.

Verifying Properties of Tsetlin Machines

Mar 25, 2023

Emilia Przybysz, Bimal Bhattarai, Cosimo Persia, Ana Ozaki, Ole-Christoffer Granmo, Jivitesh Sharma

Abstract:Tsetlin Machines (TsMs) are a promising and interpretable machine learning method which can be applied for various classification tasks. We present an exact encoding of TsMs into propositional logic and formally verify properties of TsMs using a SAT solver. In particular, we introduce in this work a notion of similarity of machine learning models and apply our notion to check for similarity of TsMs. We also consider notions of robustness and equivalence from the literature and adapt them for TsMs. Then, we show the correctness of our encoding and provide results for the properties: adversarial robustness, equivalence, and similarity of TsMs. In our experiments, we employ the MNIST and IMDB datasets for (respectively) image and sentiment classification. We compare the robustness verification results obtained with TsMs against those reported in the literature for Binarized Neural Networks on MNIST.

* 12 pages

Interpretable Tsetlin Machine-based Premature Ventricular Contraction Identification

Jan 20, 2023

Jinbao Zhang, Xuan Zhang, Lei Jiao, Ole-Christoffer Granmo, Yongjun Qian, Fan Pan

Abstract:Neural network-based models have found wide use in automatic long-term electrocardiogram (ECG) analysis. However, such black box models are inadequate for analysing physiological signals where credibility and interpretability are crucial. Indeed, how to make ECG analysis transparent is still an open problem. In this study, we develop a Tsetlin machine (TM) based architecture for premature ventricular contraction (PVC) identification by analysing long-term ECG signals. The architecture is transparent because it describes patterns directly with logical AND rules. To validate the accuracy of our approach, we compare the TM performance with that of convolutional neural networks (CNNs). Our numerical results demonstrate that the TM provides performance comparable to CNNs on the MIT-BIH database. To validate interpretability, we provide explanatory diagrams that show how the TM identifies PVCs from confirming and invalidating patterns. We argue that these are compatible with medical knowledge so that they can be readily understood and verified by a medical doctor. Accordingly, we believe this study paves the way for machine learning (ML) for ECG analysis in clinical practice.

Building Concise Logical Patterns by Constraining Tsetlin Machine Clause Size

Jan 19, 2023

K. Darshana Abeyrathna, Ahmed Abdulrahem Othman Abouzeid, Bimal Bhattarai, Charul Giri, Sondre Glimsdal, Ole-Christoffer Granmo, Lei Jiao, Rupsa Saha, Jivitesh Sharma, Svein Anders Tunheim (+1 more)

Abstract:Tsetlin machine (TM) is a logic-based machine learning approach with the crucial advantages of being transparent and hardware-friendly. While TMs match or surpass deep learning accuracy for an increasing number of applications, large clause pools tend to produce clauses with many literals (long clauses). As such, they become less interpretable. Further, longer clauses increase the switching activity of the clause logic in hardware, consuming more power. This paper introduces a novel variant of TM learning - Clause Size Constrained TMs (CSC-TMs) - where one can set a soft constraint on the clause size. As soon as a clause includes more literals than the constraint allows, it starts expelling literals. Accordingly, oversized clauses only appear transiently. To evaluate CSC-TM, we conduct classification, clustering, and regression experiments on tabular data, natural language text, images, and board games. Our results show that CSC-TM maintains accuracy with up to 80 times fewer literals. Indeed, the accuracy increases with shorter clauses for TREC, IMDb, and BBC Sports. After the accuracy peaks, it drops gracefully as the clause size approaches a single literal. We finally analyze CSC-TM power consumption and derive new convergence properties.

* 17 pages, 4 figures
