Researcher profile

Hadda Cherroun

Hadda Cherroun contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified
Verification L1
Unclaimed author
4 works
0 followers
6 topics
4 close collaborators

Published work

4 published items

preprint | 2020 | arXiv

Mining Frequent Itemsets: a Formal Unification

It is generally well agreed that developing a unifying theory is one of the most important issues in Data Mining research. In the last two decades, a great deal of work has been devoted to the algorithmic aspects of the Frequent Itemset (FI) Mining problem. We are motivated by the need for formal modeling in the field. Thus, we introduce and analyze, in this theoretical study, a new model for the FI mining task. Indeed, we encode the itemsets as words over an ordered alphabet, and state this problem by a formal series over the counting semiring $(\mathbb{N},+,\times,0,1)$, whose range constitutes the itemsets and the coefficients are their supports. This formalism offers many advantages in both fundamental and practical aspects: the introduction of a clear and unified theoretical framework through which we can express the main FI-approaches, the possibility of their generalization to mine other more complex objects, and their incrementalisation or parallelisation; in practice, we explain how this problem can be seen as that of word recognition by an automaton, allowing an efficient implementation in $O(|Q|)$ space and $O(|\mathcal{F}_L||Q|)$ time, where $Q$ is the set of states of the automaton used for representing the data, and $\mathcal{F}_L$ the set of prefixial longest FI.
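
The encoding described in this abstract (itemsets as words over an ordered alphabet, supports as coefficients) can be illustrated with a small sketch. This is not the paper's automaton-based algorithm: it is a brute-force enumeration, exponential in transaction length, and the function names and sample transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def support_series(transactions):
    """Map each itemset, written as a sorted word (tuple), to its support.

    The returned Counter plays the role of a formal series over the
    counting semiring: keys are words, values are natural-number coefficients.
    """
    series = Counter()
    for transaction in transactions:
        # Sorting fixes an order on the alphabet, so every itemset
        # has exactly one word representing it.
        word = sorted(set(transaction))
        for k in range(1, len(word) + 1):
            for sub in combinations(word, k):
                series[sub] += 1
    return series


def frequent_itemsets(transactions, min_support):
    """Keep only the words whose coefficient reaches min_support."""
    series = support_series(transactions)
    return {w: c for w, c in series.items() if c >= min_support}


# Hypothetical toy data, for illustration only.
transactions = [["a", "b", "c"], ["a", "c"], ["b", "c"], ["a", "b", "c"]]
frequent = frequent_itemsets(transactions, min_support=3)
```

With a minimum support of 3, `frequent` keeps the words `a`, `b`, `c`, `ac` and `bc`; the itemsets `ab` and `abc` occur in only two transactions.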

preprint | 2015 | arXiv

Construction of rational expression from tree automata using a generalization of Arden's Lemma

Arden's Lemma is a classical result in language theory allowing the computation of a rational expression denoting the language recognized by a finite string automaton. In this paper we generalize this important lemma to the rational tree languages. Moreover, we also propose a construction of a rational tree expression which denotes the accepted tree language of a finite tree automaton.
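
For reference, the classical string version of Arden's Lemma, which this paper generalizes to trees, can be stated and applied as below. The two-state automaton is a hypothetical example chosen for illustration and does not come from the paper.

```latex
% Arden's Lemma (string case): for languages A and B with \varepsilon \notin A,
% the equation X = AX + B has the unique solution X = A^{*}B.
%
% Hypothetical automaton: states q_0 (initial) and q_1 (final), with
% transitions q_0 --a--> q_0, q_0 --b--> q_1, q_1 --b--> q_1.
% X_i denotes the language accepted when starting from q_i.
\begin{align*}
X_1 &= b X_1 + \varepsilon
    &&\Rightarrow\; X_1 = b^{*} \\
X_0 &= a X_0 + b X_1 = a X_0 + b\,b^{*}
    &&\Rightarrow\; X_0 = a^{*} b\, b^{*}
\end{align*}
```

Solving the equation of the final state first and substituting it into the equation of the initial state yields the rational expression $a^{*} b\, b^{*}$ for the recognized language.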

preprint | 2015 | arXiv

Efficient Geometric-based Computation of the String Subsequence Kernel

Kernel methods are powerful tools in machine learning. They have to be computationally efficient. In this paper, we present a novel geometric-based approach to compute the string subsequence kernel (SSK) efficiently. Our main idea is that the SSK computation reduces to a range query problem. We start by constructing a match list $L(s,t)=\{(i,j):s_{i}=t_{j}\}$, where $s$ and $t$ are the strings to be compared; this match list contains only the data that contribute to the result. To compute the SSK efficiently, we extend the layered range tree data structure to a layered range sum tree, a range-aggregation data structure. The whole process takes $O(p|L|\log|L|)$ time and $O(|L|\log|L|)$ space, where $|L|$ is the size of the match list and $p$ is the length of the SSK. We present empirical evaluations of our approach against the dynamic programming and sparse dynamic programming approaches, both on synthetically generated data and on newswire article data. These experiments show the efficiency of our approach for large alphabet sizes, except for very short strings. Moreover, the proposed approach outperforms the sparse dynamic approach on long strings.
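
The reduction to range queries can be sketched on a small scale. The code below builds the match list and evaluates the kernel with a brute-force sum over dominated matches, which is the quantity a range-sum structure would answer; it runs in $O(p|L|^2)$ time and is not the paper's layered range sum tree. The decay factor `lam` and the gap-weighted definition are assumptions taken from the standard SSK formulation, since the abstract does not spell them out, and the function names are hypothetical.

```python
def match_list(s, t):
    """All index pairs (i, j) with s[i] == t[j]."""
    return [(i, j) for i, a in enumerate(s) for j, b in enumerate(t) if a == b]


def ssk(s, t, p, lam):
    """Gap-weighted subsequence kernel of order p (p >= 1), brute force.

    Assumed definition: each pair of occurrences of a common subsequence
    contributes lam ** (span in s + span in t).
    """
    matches = match_list(s, t)
    # d[(i, j)]: total weight of common subsequences of the current
    # length whose last characters are s[i] and t[j].
    d = {m: lam ** 2 for m in matches}
    for _ in range(p - 1):
        nxt = {}
        for (i, j) in matches:
            # Sum over matches strictly dominated by (i, j): a 2D
            # dominance query, answered here by a plain scan.
            nxt[(i, j)] = sum(
                d[(a, b)] * lam ** ((i - a) + (j - b))
                for (a, b) in matches
                if a < i and b < j
            )
        d = nxt
    return sum(d.values())


value = ssk("cat", "car", p=2, lam=0.5)
```

For `"cat"` and `"car"` the only common subsequence of length 2 is `ca`, contiguous in both strings, so `value` equals `lam ** 4`.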

preprint | 2015 | arXiv

Rational Kernels for Arabic Stemming and Text Classification

In this paper, we address the problems of Arabic Text Classification and stemming using Transducers and Rational Kernels. We introduce a new stemming technique based on the use of Arabic patterns (Pattern Based Stemmer). Patterns are modelled using transducers and stemming is done without depending on any dictionary. Using transducers for stemming, documents are transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic Text Classification. Stemming experiments are conducted on three word collections and classification experiments are done on the Saudi Press Agency dataset. Results show that our approach, when compared with other approaches, is promising, especially in terms of Accuracy, Recall and F1.
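
The idea of dictionary-free, pattern-based stemming can be illustrated with a minimal sketch. It is not the paper's transducer construction: it matches a word position by position against a pattern in which the letters ف, ع and ل stand for the three root consonants, and it assumes those letters never occur as literals in a pattern. The two patterns and all function names are hypothetical choices for illustration.

```python
SLOTS = "فعل"  # placeholder letters standing for the root consonants


def match_pattern(word, pattern):
    """Return the root extracted by the pattern, or None if it does not match."""
    if len(word) != len(pattern):
        return None
    root = []
    for w, p in zip(word, pattern):
        if p in SLOTS:
            root.append(w)  # slot position: copy the word's letter
        elif w != p:
            return None     # literal position: letters must agree
    return "".join(root)


def stem(word, patterns):
    """Try each pattern in order; fall back to the word itself."""
    for pattern in patterns:
        root = match_pattern(word, pattern)
        if root is not None:
            return root
    return word


# Two common patterns, used here only as sample input.
PATTERNS = ["مفعول", "فاعل"]
root_1 = stem("كاتب", PATTERNS)
root_2 = stem("مكتوب", PATTERNS)
```

Both sample words reduce to the root كتب without consulting any word list; a word that matches no pattern is returned unchanged.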