Ole-Christoffer Granmo

A Novel Multi-Step Finite-State Automaton for Arbitrarily Deterministic Tsetlin Machine Learning

Jul 04, 2020

K. Darshana Abeyrathna, Ole-Christoffer Granmo, Rishad Shafik, Alex Yakovlev, Adrian Wheeldon, Jie Lei, Morten Goodwin

Abstract: Due to the high energy consumption and scalability challenges of deep learning, there is a critical need to shift research focus towards dealing with energy consumption constraints. Tsetlin Machines (TMs) are a recent approach to machine learning that has demonstrated significantly reduced energy usage compared to neural networks, while performing competitively accuracy-wise on several benchmarks. However, TMs rely heavily on energy-costly random number generation to stochastically guide a team of Tsetlin Automata to a Nash Equilibrium of the TM game. In this paper, we propose a novel finite-state learning automaton that can replace the Tsetlin Automata in TM learning, for increased determinism. The new automaton uses multi-step deterministic state jumps to reinforce sub-patterns. Simultaneously, flipping a coin to skip every $d$'th state update ensures diversification by randomization. The $d$-parameter thus allows the degree of randomization to be finely controlled. E.g., $d=1$ makes every update random and $d=\infty$ makes the automaton completely deterministic. Our empirical results show that, overall, only substantial degrees of determinism reduce accuracy. Energy-wise, random number generation contributes to the switching energy consumption of the TM, and reducing it saves up to 11 mW of power for larger datasets with high $d$ values. We can thus use the new $d$-parameter to trade off accuracy against energy consumption, to facilitate low-energy machine learning.

* 10 pages, 8 figures, 7 tables
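
The role of the $d$-parameter described in the abstract above can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `SkippingAutomaton`, the state layout, and the `delta` step size are assumptions made for illustration. Only the coin flip on every $d$'th update and the two extremes ($d=1$, $d=\infty$) are taken from the abstract.

```python
import math
import random


class SkippingAutomaton:
    """Two-action finite-state automaton; every d-th update is gated by a coin flip.

    Hypothetical layout: states 1..n select action 0, states n+1..2n select action 1.
    """

    def __init__(self, n_states_per_action, d, rng=None):
        self.n = n_states_per_action
        self.d = d                      # positive integer, or math.inf for full determinism
        self.state = self.n             # start at the boundary, on the action-0 side
        self.updates = 0                # number of update requests seen so far
        self.rng = rng or random.Random()

    def action(self):
        return 1 if self.state > self.n else 0

    def update(self, delta):
        """Jump `delta` states (positive moves toward action 1).

        Returns True if the jump was applied, False if the coin flip skipped it.
        """
        self.updates += 1
        if self.d != math.inf and self.updates % self.d == 0:
            if self.rng.random() < 0.5:     # coin flip on every d-th update
                return False
        self.state = min(2 * self.n, max(1, self.state + delta))
        return True
```

With `d = 1` every update is subject to the coin flip, while `d = math.inf` never consults the random number generator, which is the property the abstract links to lower energy use.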

Extending the Tsetlin Machine With Integer-Weighted Clauses for Increased Interpretability

May 11, 2020

K. Darshana Abeyrathna, Ole-Christoffer Granmo, Morten Goodwin

Abstract: Despite significant effort, building models that are both interpretable and accurate is an unresolved challenge for many pattern recognition problems. In general, rule-based and linear models lack accuracy, while deep learning interpretability is based on rough approximations of the underlying inference. Using a linear combination of conjunctive clauses in propositional logic, Tsetlin Machines (TMs) have shown competitive performance on diverse benchmarks. However, to do so, many clauses are needed, which impacts interpretability. Here, we address the accuracy-interpretability challenge in machine learning by equipping the TM clauses with integer weights. The resulting Integer Weighted TM (IWTM) deals with the problem of learning which clauses are inaccurate and thus must team up to obtain high accuracy as a team (low weight clauses), and which clauses are sufficiently accurate to operate more independently (high weight clauses). Since each TM clause is formed adaptively by a team of Tsetlin Automata, identifying effective weights becomes a challenging online learning problem. We address this problem by extending each team of Tsetlin Automata with a stochastic searching on the line (SSL) automaton. In our novel scheme, the SSL automaton learns the weight of its clause in interaction with the corresponding Tsetlin Automata team, which, in turn, adapts the composition of the clause by the adjusting weight. We evaluate IWTM empirically using five datasets, including a study of interpretability. On average, IWTM uses 6.5 times fewer literals than the vanilla TM and 120 times fewer literals than a TM with real-valued weights. Furthermore, in terms of average F1-Score, IWTM outperforms simple Multi-Layered Artificial Neural Networks, Decision Trees, Support Vector Machines, K-Nearest Neighbor, Random Forest, XGBoost, Explainable Boosting Machines, and standard and real-value weighted TMs.

* 20 pages, 10 figures

Increasing the Inference and Learning Speed of Tsetlin Machines with Clause Indexing

Apr 07, 2020

Saeed Rahimi Gorji, Ole-Christoffer Granmo, Sondre Glimsdal, Jonathan Edwards, Morten Goodwin

Abstract: The Tsetlin Machine (TM) is a machine learning algorithm founded on the classical Tsetlin Automaton (TA) and game theory. It further leverages frequent pattern mining and resource allocation principles to extract common patterns in the data, rather than relying on minimizing output error, which is prone to overfitting. Unlike the intertwined nature of pattern representation in neural networks, a TM decomposes problems into self-contained patterns, represented as conjunctive clauses. The clause outputs, in turn, are combined into a classification decision through summation and thresholding, akin to a logistic regression function, but with binary weights and a unit step output function. In this paper, we exploit this hierarchical structure by introducing a novel algorithm that avoids evaluating the clauses exhaustively. Instead, we use a simple look-up table that indexes the clauses on the features that falsify them. In this manner, we can quickly evaluate a large number of clauses through falsification, simply by iterating through the features and using the look-up table to eliminate those clauses that are falsified. The look-up table is further structured so that it facilitates constant time updating, thus supporting use also during learning. We report up to 15 times faster classification and three times faster learning on MNIST and Fashion-MNIST image classification, and IMDb sentiment analysis.

* 14 pages, 8 figures
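
The falsification idea in the abstract above can be sketched as follows. This is an illustrative sketch only: the clause encoding (a collection of `(feature, wanted_value)` pairs) and the function names are assumptions, and the constant-time table updating mentioned in the abstract is not reproduced here.

```python
from collections import defaultdict


def build_index(clauses):
    """Map each observation (feature, value) to the ids of the clauses it falsifies."""
    index = defaultdict(list)
    for cid, clause in enumerate(clauses):
        for k, want in clause:
            # Observing x[k] == 1 - want contradicts a literal that requires x[k] == want.
            index[(k, 1 - want)].append(cid)
    return index


def evaluate_indexed(index, n_clauses, x):
    """Start with all clauses true and strike out those falsified by the input bits."""
    alive = [1] * n_clauses
    for k, value in enumerate(x):
        for cid in index.get((k, value), ()):
            alive[cid] = 0
    return alive


def evaluate_exhaustive(clauses, x):
    """Reference evaluation: check every literal of every clause."""
    return [int(all(x[k] == want for k, want in clause)) for clause in clauses]
```

Both functions return the same clause outputs for binary inputs; the indexed version only touches clauses that some input bit actually falsifies, instead of checking every literal of every clause.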

A Regression Tsetlin Machine with Integer Weighted Clauses for Compact Pattern Representation

Feb 04, 2020

K. Darshana Abeyrathna, Ole-Christoffer Granmo, Morten Goodwin

Abstract: The Regression Tsetlin Machine (RTM) addresses the lack of interpretability impeding state-of-the-art nonlinear regression models. It does this by using conjunctive clauses in propositional logic to capture the underlying non-linear frequent patterns in the data. These, in turn, are combined into a continuous output through summation, akin to a linear regression function, but with non-linear components and unity weights. Although the RTM has solved non-linear regression problems with competitive accuracy, the resolution of the output is proportional to the number of clauses employed. This means that computation cost increases with resolution. To reduce this problem, we here introduce integer weighted RTM clauses. Our integer weighted clause is a compact representation of multiple clauses that capture the same sub-pattern: N repeating clauses are turned into one, with an integer weight N. This reduces computation cost N times, and increases interpretability through a sparser representation. We further introduce a novel learning scheme that allows us to simultaneously learn both the clauses and their weights, taking advantage of so-called stochastic searching on the line. We evaluate the potential of the integer weighted RTM empirically using six artificial datasets. The results show that the integer weighted RTM is able to acquire on par or better accuracy using significantly fewer computational resources compared to regular RTMs. We further show that integer weights yield improved accuracy over real-valued ones.

* 12 pages, 4 figures, 2 tables
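
The compaction described in the abstract above, where N repeating clauses become one clause with integer weight N, can be shown with a small sketch. The clause encoding and the helper names are assumptions for illustration, and the sketch covers only the representation, not the learning scheme for the weights.

```python
from collections import Counter


def clause_output(clause, x):
    """A conjunctive clause outputs 1 only if every literal matches the input."""
    return int(all(x[k] == want for k, want in clause))


def unweighted_sum(clauses, x):
    """Plain summation with unity weights: one evaluation per clause."""
    return sum(clause_output(c, x) for c in clauses)


def compress(clauses):
    """Collapse identical clauses into one entry whose count is its integer weight."""
    return Counter(frozenset(c) for c in clauses)


def weighted_sum(weighted, x):
    """Same total as unweighted_sum, with one evaluation per distinct clause."""
    return sum(w * clause_output(c, x) for c, w in weighted.items())
```

For a clause list in which one pattern occurs three times, `compress` yields a single entry with weight 3, and `weighted_sum` matches `unweighted_sum` on every input while evaluating that pattern once.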

The Weighted Tsetlin Machine: Compressed Representations with Weighted Clauses

Jan 14, 2020

Adrian Phoulady, Ole-Christoffer Granmo, Saeed Rahimi Gorji, Hady Ahmady Phoulady

Abstract: The Tsetlin Machine (TM) is an interpretable mechanism for pattern recognition that constructs conjunctive clauses from data. The clauses capture frequent patterns with high discriminating power, providing increasing expression power with each additional clause. However, the resulting accuracy gain comes at the cost of linear growth in computation time and memory usage. In this paper, we present the Weighted Tsetlin Machine (WTM), which reduces computation time and memory usage by weighting the clauses. Real-valued weighting allows one clause to replace multiple, and supports fine-tuning the impact of each clause. Our novel scheme simultaneously learns both the composition of the clauses and their weights. Furthermore, we increase training efficiency by replacing $k$ Bernoulli trials of success probability $p$ with a uniform sample of average size $p k$, the size drawn from a binomial distribution. In our empirical evaluation, the WTM achieved the same accuracy as the TM on MNIST, IMDb, and Connect-4, requiring only $1/4$, $1/3$, and $1/50$ of the clauses, respectively. With the same number of clauses, the WTM outperformed the TM, obtaining peak test accuracies of respectively $98.63\%$, $90.37\%$, and $87.91\%$. Finally, our novel sampling scheme reduced sample generation time by a factor of $7$.

* Accepted at the Ninth International Workshop on Statistical Relational AI
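
The sampling trick in the abstract above (one binomial draw for the sample size, followed by a uniform sample of that size, in place of $k$ separate Bernoulli trials) can be sketched like this. The function names are hypothetical, and the fallback branch exists only so the sketch runs on Python versions older than 3.12, where `random.Random.binomialvariate` is not available; the fallback gives the same distribution but not the speed-up.

```python
import random


def draw_size(k, p, rng):
    """Draw the number of successes among k trials with success probability p."""
    if hasattr(rng, "binomialvariate"):          # available from Python 3.12
        return rng.binomialvariate(k, p)
    # Fallback for older interpreters: same distribution, no efficiency gain.
    return sum(rng.random() < p for _ in range(k))


def sample_updates(k, p, rng):
    """Pick which of k items get updated.

    Each item ends up selected with probability p, as with k independent
    Bernoulli(p) trials, but the selection uses a single size draw plus
    one uniform sample without replacement.
    """
    m = draw_size(k, p, rng)
    return rng.sample(range(k), m)
```

The returned list has average length $p k$, in line with the abstract; any timing benefit depends on how the binomial draw is implemented.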

Environment Sound Classification using Multiple Feature Channels and Deep Convolutional Neural Networks

Sep 25, 2019

Jivitesh Sharma, Ole-Christoffer Granmo, Morten Goodwin

Abstract: In this paper, we propose a model for the Environment Sound Classification Task (ESC) that consists of multiple feature channels given as input to a Deep Convolutional Neural Network (CNN). The novelty of the paper lies in using multiple feature channels consisting of Mel-Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GFCC), the Constant Q-transform (CQT) and Chromagram. Such multiple features have never been used before for signal or audio processing. Also, we employ a deeper CNN (DCNN) compared to previous models, consisting of 2D separable convolutions working on time and feature domain separately. The model also consists of max pooling layers that downsample time and feature domain separately. We use some data augmentation techniques to further boost performance. Our model is able to achieve state-of-the-art performance on all three benchmark environment sound classification datasets, i.e. the UrbanSound8K (97.35%), ESC-10 (95.75%) and ESC-50 (90.48%). To the best of our knowledge, this is the first time that a single environment sound classification model is able to achieve state-of-the-art results on all three datasets. For ESC-10 and ESC-50 datasets, the accuracy achieved by the proposed model is beyond human accuracy of 95.7% and 81.3% respectively.

* We have corrected the error in the code from the previous version of the paper. The new results still set a new state of the art and are more theoretically plausible than before

A Tsetlin Machine with Multigranular Clauses

Sep 16, 2019

Saeed Rahimi Gorji, Ole-Christoffer Granmo, Adrian Phoulady, Morten Goodwin

Abstract: The recently introduced Tsetlin Machine (TM) has provided competitive pattern recognition accuracy in several benchmarks; however, it requires a 3-dimensional hyperparameter search. In this paper, we introduce the Multigranular Tsetlin Machine (MTM). The MTM eliminates the specificity hyperparameter, used by the TM to control the granularity of the conjunctive clauses that it produces for recognizing patterns. Instead of using a fixed global specificity, we encode varying specificity as part of the clauses, rendering the clauses multigranular. This makes it easier to configure the TM because the dimensionality of the hyperparameter search space is reduced to only two dimensions. Indeed, it turns out that there is significantly less hyperparameter tuning involved in applying the MTM to new problems. Further, we demonstrate empirically that the MTM provides similar performance to what is achieved with a finely specificity-optimized TM, by comparing their performance on both synthetic and real-world datasets.

* 6 pages, 4 figures

Towards Model-based Reinforcement Learning for Industry-near Environments

Jul 27, 2019

Per-Arne Andersen, Morten Goodwin, Ole-Christoffer Granmo

Abstract: Deep reinforcement learning has over the past few years shown great potential in learning near-optimal control in complex simulated environments with little visible information. Rainbow (Q-Learning) and PPO (Policy Optimisation) have shown outstanding performance in a variety of tasks, including Atari 2600, MuJoCo, and the Roboschool test suite. While these algorithms are fundamentally different, both suffer from high variance, low sample efficiency, and hyperparameter sensitivity that, in practice, make these algorithms a no-go for critical operations in the industry. On the other hand, model-based reinforcement learning focuses on learning the transition dynamics between states in an environment. If these environment dynamics are adequately learned, a model-based approach is perhaps the most sample efficient method for learning agents to act in an environment optimally. The traits of model-based reinforcement learning are ideal for real-world environments where sampling is slow and for mission-critical operations. In the warehouse industry, there is an increasing motivation to minimise time and to maximise production. Currently, autonomous agents act suboptimally using handcrafted policies for significant portions of the state-space. In this paper, we present The Dreaming Variational Autoencoder v2 (DVAE-2), a model-based reinforcement learning algorithm that increases sample efficiency, hence enabling algorithms with low sample efficiency to function better in real-world environments. We introduce Deep Warehouse, a simulated environment for industry-near testing of autonomous agents in grid-based warehouses. Finally, we illustrate that DVAE-2 improves the sample efficiency for the Deep Warehouse compared to model-free methods.

A Neural Turing Machine for Conditional Transition Graph Modeling

Jul 15, 2019

Mehdi Ben Lazreg, Morten Goodwin, Ole-Christoffer Granmo

Abstract: Graphs are an essential part of many machine learning problems such as analysis of parse trees, social networks, knowledge graphs, transportation systems, and molecular structures. Applying machine learning in these areas typically involves learning the graph structure and the relationship between the nodes of the graph. However, learning the graph structure is often complex, particularly when the graph is cyclic and the transitions from one node to another are conditioned, as in graphs used to represent a finite state machine. To solve this problem, we propose to extend the memory-based Neural Turing Machine (NTM) with two novel additions. We allow for transitions between nodes to be influenced by information received from external environments, and we let the NTM learn the context of those transitions. We refer to this extension as the Conditional Neural Turing Machine (CNTM). We show that the CNTM can infer conditional transition graphs by empirically verifying the model on two data sets: a large set of randomly generated graphs, and a graph modeling the information retrieval process during certain crisis situations. The results show that the CNTM is able to reproduce the paths inside the graph with accuracy ranging from 82.12% for 10-node graphs to 65.25% for 100-node graphs.

* Submitted to IEEE Transactions on Neural Networks and Learning Systems

The Convolutional Tsetlin Machine

May 25, 2019

Ole-Christoffer Granmo, Sondre Glimsdal, Lei Jiao, Morten Goodwin, Christian W. Omlin, Geir Thore Berge

Abstract: Deep neural networks have obtained astounding successes for important pattern recognition tasks, but they suffer from high computational complexity and the lack of interpretability. The recent Tsetlin Machine (TM) attempts to address this lack by using easy-to-interpret conjunctive clauses in propositional logic to solve complex pattern recognition problems. The TM provides competitive accuracy in several benchmarks, while keeping the important property of interpretability. It further facilitates hardware-near implementation since inputs, patterns, and outputs are expressed as bits, while recognition and learning rely on straightforward bit manipulation. In this paper, we exploit the TM paradigm by introducing the Convolutional Tsetlin Machine (CTM), as an interpretable alternative to convolutional neural networks (CNNs). Whereas the TM categorizes an image by applying each clause once to the whole image, the CTM uses each clause as a convolution filter. That is, a clause is evaluated multiple times, once per image patch taking part in the convolution. To make the clauses location-aware, each patch is further augmented with its coordinates within the image. The output of a convolution clause is obtained simply by ORing the outcome of evaluating the clause on each patch. In the learning phase of the TM, clauses that evaluate to 1 are contrasted against the input. For the CTM, we instead contrast against one of the patches, randomly selected among the patches that made the clause evaluate to 1. Accordingly, the standard Type I and Type II feedback of the classic TM can be employed directly, without further modification. The CTM obtains a peak test accuracy of 99.51% on MNIST, 96.21% on Kuzushiji-MNIST, 89.56% on Fashion-MNIST, and 100.0% on the 2D Noisy XOR Problem, which is competitive with results reported for simple 4-layer CNNs, BinaryConnect, and a recent FPGA-accelerated Binary CNN.

* 9 pages, 8 figures
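
The inference step described in the abstract above (evaluate a clause once per patch, append the patch coordinates, OR the results) can be sketched as below. This is a simplified illustration, not the paper's code: the clause encoding, the function names, and in particular the thermometer-style coordinate bits are assumptions, since the abstract does not say how coordinates are encoded.

```python
def patches(image, w):
    """Yield the feature vector of every w-by-w patch of a binary image.

    Each vector holds the flattened patch pixels followed by coordinate bits.
    Assumed encoding: bit t of a coordinate is 1 when the coordinate exceeds t.
    """
    rows, cols = len(image), len(image[0])
    for r in range(rows - w + 1):
        for c in range(cols - w + 1):
            pixels = [image[r + i][c + j] for i in range(w) for j in range(w)]
            row_bits = [int(r > t) for t in range(rows - w)]
            col_bits = [int(c > t) for t in range(cols - w)]
            yield pixels + row_bits + col_bits


def clause_on_patch(clause, features):
    """Conjunction of literals, each a (feature_index, wanted_value) pair."""
    return all(features[k] == want for k, want in clause)


def convolution_clause(clause, image, w):
    """OR the clause outcome over all patches of the image."""
    return int(any(clause_on_patch(clause, f) for f in patches(image, w)))
```

A clause that includes a coordinate literal matches only in part of the image, which is one way to read the location awareness mentioned in the abstract.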
