Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Florent Avellaneda

Boolean Matrix Factorization with SAT and MaxSAT

Jun 18, 2021

Florent Avellaneda, Roger Villemaire

Figure 1 for Boolean Matrix Factorization with SAT and MaxSAT

Figure 2 for Boolean Matrix Factorization with SAT and MaxSAT

Figure 3 for Boolean Matrix Factorization with SAT and MaxSAT

Figure 4 for Boolean Matrix Factorization with SAT and MaxSAT

Abstract:The Boolean matrix factorization problem consists in approximating a matrix by the Boolean product of two smaller Boolean matrices. To obtain optimal solutions when the matrices to be factorized are small, we propose SAT and MaxSAT encoding; however, when the matrices to be factorized are large, we propose a heuristic based on the search for maximal biclique edge cover. We experimentally demonstrate that our approaches allow a better factorization than existing approaches while keeping reasonable computation times. Our methods also allow the handling of incomplete matrices with missing entries.

Via

Access Paper or Ask Questions

An Approach to Evaluating Learning Algorithms for Decision Trees

Oct 26, 2020

Tianqi Xiao, Omer Nguena Timo, Florent Avellaneda, Yasir Malik, Stefan Bruda

Figure 1 for An Approach to Evaluating Learning Algorithms for Decision Trees

Figure 2 for An Approach to Evaluating Learning Algorithms for Decision Trees

Figure 3 for An Approach to Evaluating Learning Algorithms for Decision Trees

Figure 4 for An Approach to Evaluating Learning Algorithms for Decision Trees

Abstract:Learning algorithms produce software models for realising critical classification tasks. Decision trees models are simpler than other models such as neural network and they are used in various critical domains such as the medical and the aeronautics. Low or unknown learning ability algorithms does not permit us to trust the produced software models, which lead to costly test activities for validating the models and to the waste of learning time in case the models are likely to be faulty due to the learning inability. Methods for evaluating the decision trees learning ability, as well as that for the other models, are needed especially since the testing of the learned models is still a hot topic. We propose a novel oracle-centered approach to evaluate (the learning ability of) learning algorithms for decision trees. It consists of generating data from reference trees playing the role of oracles, producing learned trees with existing learning algorithms, and determining the degree of correctness (DOE) of the learned trees by comparing them with the oracles. The average DOE is used to estimate the quality of the learning algorithm. the We assess five decision tree learning algorithms based on the proposed approach.

Via

Access Paper or Ask Questions

Learning Optimal Decision Trees from Large Datasets

Apr 12, 2019

Florent Avellaneda

Figure 1 for Learning Optimal Decision Trees from Large Datasets

Figure 2 for Learning Optimal Decision Trees from Large Datasets

Figure 3 for Learning Optimal Decision Trees from Large Datasets

Figure 4 for Learning Optimal Decision Trees from Large Datasets

Abstract:Inferring a decision tree from a given dataset is one of the classic problems in machine learning. This problem consists of buildings, from a labelled dataset, a tree such that each node corresponds to a class and a path between the tree root and a leaf corresponds to a conjunction of features to be satisfied in this class. Following the principle of parsimony, we want to infer a minimal tree consistent with the dataset. Unfortunately, inferring an optimal decision tree is known to be NP-complete for several definitions of optimality. Hence, the majority of existing approaches relies on heuristics, and as for the few exact inference approaches, they do not work on large data sets. In this paper, we propose a novel approach for inferring a decision tree of a minimum depth based on the incremental generation of Boolean formula. The experimental results indicate that it scales sufficiently well and the time it takes to run grows slowly with the size of dataset.

Via

Access Paper or Ask Questions