Bojan Žunkovič

Positive unlabeled learning with tensor networks

Nov 25, 2022

Bojan Žunkovič

Abstract: Positive unlabeled learning is a binary classification problem with positive and unlabeled data. It is common in domains where negative labels are costly or impossible to obtain, e.g., medicine and personalized advertising. We apply the locally purified state tensor network to the positive unlabeled learning problem and test our model on the MNIST image dataset and 15 categorical/mixed datasets. On the MNIST dataset, we achieve state-of-the-art results even with very few labeled positive samples. Similarly, we significantly improve on the state of the art on categorical datasets. Further, we show that the agreement fraction between outputs of different models on unlabeled samples is a good indicator of the model's performance. Finally, our method can generate new positive and negative instances, which we demonstrate on simple synthetic datasets.

* 12 pages, 5 figures, 4 tables

Grokking phase transitions in learning local rules with gradient descent

Oct 26, 2022

Bojan Žunkovič, Enej Ilievski

Abstract: We discuss two solvable grokking (generalisation beyond overfitting) models in a rule learning scenario. We show that grokking is a phase transition and find exact analytic expressions for the critical exponents, grokking probability, and grokking time distribution. Further, we introduce a tensor-network map that connects the proposed grokking setup with the standard (perceptron) statistical learning theory and show that grokking is a consequence of the locality of the teacher model. As an example, we analyse the cellular automata learning task, numerically determine the critical exponent and the grokking time distributions and compare them with the prediction of the proposed grokking model. Finally, we numerically analyse the connection between structure formation and grokking.

* 31+10 pages, 22 figures

Deep tensor networks with matrix product operators

Sep 16, 2022

Bojan Žunkovič

Abstract: We introduce deep tensor networks, which are exponentially wide neural networks based on the tensor network representation of the weight matrices. We evaluate the proposed method on image classification (MNIST, FashionMNIST) and sequence prediction (cellular automata) tasks. In the image classification case, deep tensor networks improve on our matrix product state baselines and achieve a 0.49% error rate on MNIST and an 8.3% error rate on FashionMNIST. In the sequence prediction case, we demonstrate an exponential improvement in the number of parameters compared to the one-layer tensor network methods. In both cases, we discuss the non-uniform and the uniform tensor network models and show that the latter generalizes well to different input sizes.

* Quantum Mach. Intell. 4, 21 (2022)
* 9+2 pages, 8 figures
