Samir Loudni

Exploiting complex pattern features for interactive pattern mining

Apr 08, 2022
Arnold Hien, Samir Loudni, Noureddine Aribi, Abdelkader Ouali, Albrecht Zimmermann

Recent years have seen a shift from a pattern mining process in which users define constraints beforehand and sift through the results afterwards to an interactive one. This new framework depends on exploiting user feedback to learn a quality function for patterns. Existing approaches have a weakness in that they use static, pre-defined low-level features and attempt to learn independent weights representing their importance to the user. As an alternative, we propose to work with more complex features that are derived directly from the pattern ranking imposed by the user. Learned weights are then aggregated onto lower-level features and help to drive the quality function in the right direction. We explore the effect of different parameter choices experimentally and find that using higher-complexity features leads to the selection of patterns that are better aligned with a hidden quality function while not adding significantly to the run times of the method. Getting good user feedback requires quickly presenting diverse patterns, something that we achieve by pushing an existing diversity constraint into the sampling component of the interactive mining system LetSip. In most cases, the resulting patterns allow faster convergence to a good solution. Finally, combining the two improvements leads to an algorithm showing clear advantages over the existing state of the art.

Via arXiv

An efficient heuristic approach combining maximal itemsets and area measure for compressing voluminous table constraints

Mar 21, 2022
Soufia Bennai, Kamala Amroun, Samir Loudni, Abdelkader Ouali

Constraint Programming is a powerful paradigm to model and solve combinatorial problems. While there are many kinds of constraints, the table constraint is perhaps the most significant: it is the most well-studied and can encode any other constraint defined on finite variables. However, such constraints can be very voluminous and their size can grow exponentially with their arity. To reduce space and time complexity, researchers have focused on various forms of compression. In this paper we propose a new approach based on the maximal frequent itemsets technique and the area measure for enumerating the maximal frequent itemsets relevant for compressing table constraints. Our experimental results show the effectiveness and efficiency of this approach on compression and on solving compressed table constraints.
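
As a rough illustration of the area measure mentioned above, the sketch below treats each tuple of a table constraint as a set of (variable, value) items and scores an itemset by its size times its support. The encoding, the function name `area`, and the toy table are assumptions made for this example; the abstract does not describe the paper's actual enumeration or compression procedure.

```python
def area(itemset, tuples):
    """Area of an itemset: number of items times number of tuples containing it."""
    support = sum(1 for t in tuples if itemset <= t)
    return len(itemset) * support


# Hypothetical table constraint over variables x, y, z (values are illustrative).
rows = [(0, 0, 1), (0, 0, 2), (0, 1, 1), (1, 0, 2)]
variables = ("x", "y", "z")

# Encode each tuple as a set of (variable, value) items.
table = [frozenset(zip(variables, row)) for row in rows]

pair_area = area(frozenset({("x", 0), ("y", 0)}), table)
single_area = area(frozenset({("x", 0)}), table)
print(pair_area, single_area)
```

Under this encoding, an itemset with a large area covers many cells of the table, which is the intuition for why it may be a good candidate to factor out during compression.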

Via arXiv

Boosting the Learning for Ranking Patterns

Mar 05, 2022
Nassim Belmecheri, Noureddine Aribi, Nadjib Lazaar, Yahia Lebbah, Samir Loudni

Discovering relevant patterns for a particular user remains a challenging task in data mining. Several approaches have been proposed to learn user-specific pattern ranking functions. These approaches generalize well, but at the expense of running time. On the other hand, several measures are often used to evaluate the interestingness of patterns, in the hope of revealing a ranking that is as close as possible to the user-specific ranking. In this paper, we formulate the problem of learning pattern ranking functions as a multicriteria decision making problem. Our approach aggregates different interestingness measures into a single weighted linear ranking function, using an interactive learning procedure that operates in either passive or active mode. A fast learning step is used for eliciting the weights of all the measures by means of pairwise comparisons. This approach is based on the Analytic Hierarchy Process (AHP) and uses a set of user-ranked patterns to build a preference matrix, which compares the importance of measures according to the user-specific interestingness. A sensitivity-based heuristic is proposed for the active learning mode, in order to ensure high-quality results with few user ranking queries. Experiments conducted on well-known datasets show that our approach significantly reduces the running time and returns precise pattern rankings, while being robust to user error compared with state-of-the-art approaches.
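
A minimal sketch of the general AHP idea referred to above: weights are derived from a pairwise preference matrix over measures and then used in a weighted linear ranking function. The matrix, the measure values, and the function names are hypothetical, and the row geometric-mean method used here is a common AHP approximation, not necessarily the procedure used in the paper (which builds its matrix from user-ranked patterns).

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def rank_patterns(patterns, weights):
    """Sort (name, measure_values) pairs by weighted linear score, best first."""
    def score(pattern):
        return sum(w * v for w, v in zip(weights, pattern[1]))
    return sorted(patterns, key=score, reverse=True)


# Hypothetical preference matrix over three interestingness measures:
# entry [i][j] says how much more important measure i is than measure j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical patterns with one value per measure.
patterns = [
    ("A", [0.2, 0.9, 0.1]),
    ("B", [0.8, 0.1, 0.3]),
    ("C", [0.5, 0.5, 0.5]),
]
ranking = [name for name, _ in rank_patterns(patterns, weights)]
print(weights, ranking)
```

Because the example matrix is perfectly consistent, the geometric-mean weights coincide with the principal-eigenvector weights; for inconsistent matrices the two methods can differ slightly.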

Via arXiv

Tractability and Decompositions of Global Cost Functions

Jun 30, 2016
David Allouche, Christian Bessiere, Patrice Boizumault, Simon de Givry, Patricia Gutierrez, Jimmy H. M. Lee, Kam Lun Leung, Samir Loudni, Jean-Philippe Métivier, Thomas Schiex, Yi Wu

Enforcing local consistencies in cost function networks is performed by applying so-called Equivalence Preserving Transformations (EPTs) to the cost functions. As EPTs transform the cost functions, they may break the property that was making local consistency enforcement tractable on a global cost function. A global cost function is called tractable projection-safe when applying an EPT to it is tractable and does not break the tractability property. In this paper, we prove that depending on the size r of the smallest scopes used for performing EPTs, the tractability of global cost functions can be preserved (r = 0) or destroyed (r > 1). When r = 1, the answer is indefinite. We show that on a large family of cost functions, EPTs can be computed via dynamic programming-based algorithms, leading to tractable projection-safety. We also show that when a global cost function can be decomposed into a Berge acyclic network of bounded arity cost functions, soft local consistencies such as soft Directed or Virtual Arc Consistency can directly emulate dynamic programming. These different approaches to decomposable cost functions are then embedded in a solver for extensive experiments that confirm the feasibility and efficiency of our proposal.

* 45 pages for the main paper, extra Appendix with examples of DAG-decomposed global cost functions 
Via arXiv

A global constraint for closed itemset mining

Apr 17, 2016
Mehdi Maamar, Nadjib Lazaar, Samir Loudni, Yahia Lebbah

Discovering the set of closed frequent patterns is one of the fundamental problems in Data Mining. Recent Constraint Programming (CP) approaches for declarative itemset mining have proven their usefulness and flexibility. However, the wide use of reified constraints in current CP approaches makes it difficult to cope with high-dimensional datasets. This paper proposes the CLOSED-PATTERN global constraint, which requires neither reified constraints nor extra variables to encode efficiently the Closed Frequent Pattern Mining (CFPM) constraint. CLOSED-PATTERN captures the particular semantics of the CFPM problem in order to provide a polynomial pruning algorithm that ensures domain consistency. The computational properties of our constraint are analyzed and its practical effectiveness is experimentally evaluated.
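
To make the notion of a closed frequent itemset concrete, here is a brute-force sketch: an itemset is closed when it equals the intersection of all transactions that contain it. This is only a definition-level illustration on a toy dataset with hypothetical names; it is not the CLOSED-PATTERN propagator, and it enumerates all candidate itemsets, so it does not scale.

```python
from itertools import combinations


def closed_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every closed itemset with enough support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Cover: the transactions that contain the candidate.
            cover = [t for t in transactions if candidate <= t]
            if not cover or len(cover) < min_support:
                continue
            # Closed iff the candidate equals the intersection of its cover.
            if frozenset.intersection(*cover) == candidate:
                result[candidate] = len(cover)
    return result


# Toy transaction database (illustrative only).
transactions = [
    frozenset("abc"),
    frozenset("ab"),
    frozenset("abd"),
    frozenset("c"),
]
closed = closed_frequent_itemsets(transactions, min_support=2)
print(closed)
```

In this toy database, `{a}` and `{b}` are frequent but not closed, because every transaction containing one also contains the other; only `{a, b}` and `{c}` are reported.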

Via arXiv

A global Constraint for mining Sequential Patterns with GAP constraint

Nov 26, 2015
Amina Kemmar, Samir Loudni, Yahia Lebbah, Patrice Boizumault, Thierry Charnois

Sequential pattern mining (SPM) under gap constraints is a challenging task. Many efficient specialized methods have been developed, but they all suffer from a lack of genericity. Constraint Programming (CP) approaches are not very effective because of the size of their encodings. In [7], we proposed the global constraint Prefix-Projection for SPM, which remedies this drawback. However, this global constraint cannot be directly extended to support gap constraints. In this paper, we propose the global constraint GAP-SEQ, which handles SPM with or without gap constraints. GAP-SEQ relies on the principle of right pattern extensions. Experiments show that our approach clearly outperforms both CP approaches and the state-of-the-art cSpade method on large datasets.
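
The following sketch illustrates what a gap constraint means for pattern support, assuming sequences of single items and a gap defined as the number of elements skipped between two consecutive matched positions. The function names and data are hypothetical, and this is a plain occurrence check, not the GAP-SEQ filtering algorithm.

```python
def occurs_with_gap(pattern, sequence, min_gap, max_gap):
    """True if pattern embeds in sequence with every gap in [min_gap, max_gap]."""
    if not pattern:
        return True
    # Positions where the prefix matched so far can end.
    ends = {i for i, item in enumerate(sequence) if item == pattern[0]}
    for item in pattern[1:]:
        next_ends = set()
        for i in ends:
            # The next match at position j must satisfy min_gap <= j - i - 1 <= max_gap.
            for j in range(i + 1 + min_gap, min(len(sequence), i + 2 + max_gap)):
                if sequence[j] == item:
                    next_ends.add(j)
        ends = next_ends
        if not ends:
            return False
    return bool(ends)


def support(pattern, database, min_gap, max_gap):
    """Number of sequences in which the pattern occurs under the gap constraint."""
    return sum(occurs_with_gap(pattern, s, min_gap, max_gap) for s in database)


seq = list("abcab")
database = [list("abcab"), list("ab"), list("acb")]
print(occurs_with_gap(["a", "b"], seq, 0, 0))
print(occurs_with_gap(["a", "b"], seq, 1, 2))
print(support(["a", "b"], database, 1, 1))
```

Tracking the set of possible end positions, rather than only the first match, matters here: under a gap constraint the earliest occurrence of a prefix is not always the one that can be extended.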

Via arXiv

Prefix-Projection Global Constraint for Sequential Pattern Mining

Jun 23, 2015
Amina Kemmar, Samir Loudni, Yahia Lebbah, Patrice Boizumault, Thierry Charnois

Sequential pattern mining under constraints is a challenging data mining task. Many efficient ad hoc methods have been developed for mining sequential patterns, but they all suffer from a lack of genericity. Recent works have investigated Constraint Programming (CP) methods, but they are still not effective because of their encoding. In this paper, we propose a global constraint based on the projected databases principle, which remedies this drawback. Experiments show that our approach clearly outperforms CP approaches and competes well with ad hoc methods on large datasets.
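
The projected databases principle can be sketched in the style of the classic PrefixSpan recursion: after fixing a prefix, only the suffixes that follow its first occurrence need to be scanned for frequent extensions. The code below is a simplified illustration for sequences of single items, with hypothetical function names and data; it is not the global constraint proposed in the paper.

```python
def project(database, item):
    """Keep, for each sequence containing item, the suffix after its first occurrence."""
    return [seq[seq.index(item) + 1:] for seq in database if item in seq]


def frequent_sequences(database, min_support, prefix=()):
    """Return {pattern: support} for all frequent sequential patterns."""
    # Count, for each item, the number of sequences in which it appears.
    counts = {}
    for seq in database:
        for item in set(seq):
            counts[item] = counts.get(item, 0) + 1

    results = {}
    for item, count in sorted(counts.items()):
        if count >= min_support:
            pattern = prefix + (item,)
            results[pattern] = count
            # Recurse on the database projected on the extended prefix.
            results.update(
                frequent_sequences(project(database, item), min_support, pattern)
            )
    return results


# Toy sequence database (illustrative only).
sequences = [list("abc"), list("ac"), list("bc"), list("ab")]
found = frequent_sequences(sequences, min_support=2)
print(found)
```

Each recursive call works on a smaller database, so supports of extensions are counted without rescanning the parts of the sequences already consumed by the prefix.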

Via arXiv

A Constraint Programming Approach for Mining Sequential Patterns in a Sequence Database

Nov 27, 2013
Jean-Philippe Métivier, Samir Loudni, Thierry Charnois

Constraint-based pattern discovery is at the core of numerous data mining tasks. Patterns are extracted with respect to a given set of constraints (frequency, closedness, size, etc.). In the context of sequential pattern mining, a large number of dedicated techniques have been developed for solving particular classes of constraints. The aim of this paper is to investigate the use of Constraint Programming (CP) to model and mine sequential patterns in a sequence database. Our CP approach offers a natural way to combine, in a single framework, a large set of constraints coming from various origins. Experiments show the feasibility and the interest of our approach.

Via arXiv

Discovering Knowledge using a Constraint-based Language

Jul 18, 2011
Patrice Boizumault, Bruno Crémilleux, Mehdi Khiari, Samir Loudni, Jean-Philippe Métivier

Discovering pattern sets or global patterns is an attractive issue for the pattern mining community, as it provides useful information. By combining local patterns satisfying a joint meaning, this approach produces higher-level patterns that are more useful to the data analyst than the usual local patterns, while reducing the number of patterns. In parallel, recent works investigating relationships between data mining and constraint programming (CP) show that the CP paradigm is a good framework to model and mine such patterns in a declarative and generic way. We present a constraint-based language which enables us to define queries addressing pattern sets and global patterns. The usefulness of such a declarative approach is highlighted by several examples drawn from clustering based on associations. This language has been implemented in the CP framework.

* 12 pages 
Via arXiv