Source author record

Djelloul Ziadi

Djelloul Ziadi appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Formal Languages and Automata Theory Computation and Language Computational Complexity Computational Geometry Databases Machine Learning math.CO

Catalog footprint

What is connected

11works

7topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2020arXiv

Mining Frequent Itemsets: a Formal Unification

It is generally well agreed that developing a unifying theory is one of the most important issues in Data Mining research. In the last two decades, a great deal of work has been devoted to the algorithmic aspects of the Frequent Itemset (FI) Mining problem. We are motivated by the need for formal modeling in the field. Thus, we introduce and analyze, in this theoretical study, a new model for the FI mining task. Indeed, we encode the itemsets as words over an ordered alphabet, and state this problem by a formal series over the counting semiring $(\mathbb{N},+,\times,0,1)$, whose range constitutes the itemsets and the coefficients are their supports. This formalism offers many advantages in both fundamental and practical aspects: the introduction of a clear and unified theoretical framework through which we can express the main FI-approaches, the possibility of their generalization to mine other more complex objects, and their incrementalisation or parallelisation; in practice, we explain how this problem can be seen as that of word recognition by an automaton, allowing an efficient implementation in $O(|Q|)$ space and $O(|\mathcal{F}_L||Q|])$ time, where $Q$ is the set of states of the automaton used for representing the data, and $\mathcal{F}_L$ the set of prefixial longest FI.

preprint2015arXiv

Algorithm for the k-Position Tree Automaton Construction

The word position automaton was introduced by Glushkov and McNaughton in the early 1960. This automaton is homogeneous and has (||\E||+1) states for a word expression of alphabetic width ||\E||. This kind of automata is extended to regular tree expressions. In this paper, we give an efficient algorithm that computes the \Follow sets, which are used in different algorithms of conversion of a regular expression into tree automata. In the following, we consider the k-position tree automaton construction. We prove that for a regular expression \E of a size |\E| and alphabetic width ||\E||, the \Follow sets can be computed in O(||\E||\cdot |\E|) time complexity.

preprint2015arXiv

Bottom Up Quotients and Residuals for Tree Languages

In this paper, we extend the notion of tree language quotients to bottom-up quotients. Instead of computing the residual of a tree language from top to bottom and producing a list of tree languages, we show how to compute a set of k-ary trees, where k is an arbitrary integer. We define the quotient formula for different combinations of tree languages: union, symbol products, compositions, iterated symbol products and iterated composition. These computations lead to the definition of the bottom-up quotient tree automaton, that turns out to be the minimal deterministic tree automaton associated with a regular tree language in the case of the 0-ary trees.

preprint2015arXiv

Construction of rational expression from tree automata using a generalization of Arden's Lemma

Arden's Lemma is a classical result in language theory allowing the computation of a rational expression denoting the language recognized by a finite string automaton. In this paper we generalize this important lemma to the rational tree languages. Moreover, we propose also a construction of a rational tree expression which denotes the accepted tree language of a finite tree automaton.

preprint2015arXiv

Efficient Geometric-based Computation of the String Subsequence Kernel

Kernel methods are powerful tools in machine learning. They have to be computationally efficient. In this paper, we present a novel Geometric-based approach to compute efficiently the string subsequence kernel (SSK). Our main idea is that the SSK computation reduces to range query problem. We started by the construction of a match list $L(s,t)=\{(i,j):s_{i}=t_{j}\}$ where $s$ and $t$ are the strings to be compared; such match list contains only the required data that contribute to the result. To compute efficiently the SSK, we extended the layered range tree data structure to a layered range sum tree, a range-aggregation data structure. The whole process takes $ O(p|L|\log|L|)$ time and $O(|L|\log|L|)$ space, where $|L|$ is the size of the match list and $p$ is the length of the SSK. We present empiric evaluations of our approach against the dynamic and the sparse programming approaches both on synthetically generated data and on newswire article data. Such experiments show the efficiency of our approach for large alphabet size except for very short strings. Moreover, compared to the sparse dynamic approach, the proposed approach outperforms absolutely for long strings.

preprint2015arXiv

Rational Kernels for Arabic Stemming and Text Classification

In this paper, we address the problems of Arabic Text Classification and stemming using Transducers and Rational Kernels. We introduce a new stemming technique based on the use of Arabic patterns (Pattern Based Stemmer). Patterns are modelled using transducers and stemming is done without depending on any dictionary. Using transducers for stemming, documents are transformed into finite state transducers. This document representation allows us to use and explore rational kernels as a framework for Arabic Text Classification. Stemming experiments are conducted on three word collections and classification experiments are done on the Saudi Press Agency dataset. Results show that our approach, when compared with other approaches, is promising specially in terms of Accuracy, Recall and F1.

preprint2015arXiv

Root-Weighted Tree Automata and their Applications to Tree Kernels

In this paper, we define a new kind of weighted tree automata where the weights are only supported by final states. We show that these automata are sequentializable and we study their closures under classical regular and algebraic operations. We then use these automata to compute the subtree kernel of two finite tree languages in an efficient way. Finally, we present some perspectives involving the root-weighted tree automata.

preprint2014arXiv

An Efficient Algorithm for the Equation Tree Automaton via the $k$-C-Continuations

Champarnaud and Ziadi, and Khorsi et al. show how to compute the equation automaton of word regular expression $E$ via the $k$-C-Continuations. Kuske and Meinecke extend the computation of the equation automaton to a regular tree expression $E$ over a ranked alphabet $Σ$ and produce a $O(R\cdot|E|^2)$ time and space complexity algorithm, where $R$ is the maximal rank of a symbol occurring in $Σ$ and $|E|$ is the size of $E$. In this paper, we give a full description of the algorithm based on the acyclic minimization of Revuz. Our algorithm, which is performed in an $O(|Q|\cdot|E|)$ time and space complexity, where $|Q|$ is the number of states of the produced automaton, is more efficient than the one obtained by Kuske and Meinecke.

preprint2014arXiv

K-Position, Follow, Equation and K-C-Continuation Tree Automata Constructions

There exist several methods of computing an automaton recognizing the language denoted by a given regular expression: In the case of words, the position automaton P due to Glushkov, the c-continuation automaton C due to Champarnaud and Ziadi, the follow automaton F due to Ilie and Yu and the equation automaton E due to Antimirov. It has been shown that P and C are isomorphic and that E (resp. F) is a quotient of C (resp. of P). In this paper, we define from a given regular tree expression the k-position tree automaton P and the follow tree automaton F . Using the definition of the equation tree automaton E of Kuske and Meinecke and our previously defined k-C-continuation tree automaton C, we show that the previous morphic relations are still valid on tree expressions.

preprint2013arXiv

Extended to Multi-Tilde-Bar Regular Expressions and Efficient Finite Automata Constructions

Several algorithms have been designed to convert a regular expression into an equivalent finite automaton. One of the most popular constructions, due to Glushkov and to McNaughton and Yamada, is based on the computation of the Null, First, Last and Follow sets (called Glushkov functions) associated with a linearized version of the expression. Recently Mignot considered a family of extended expressions called Extended to multi-tilde-bar Regular Expressions (EmtbREs) and he showed that, under some restrictions, Glushkov functions can be defined for an EmtbRE. In this paper we present an algorithm which efficiently computes the Glushkov functions of an unrestricted EmtbRE. Our approach is based on a recursive definition of the language associated with an EmtbRE which enlightens the fact that the worst case time complexity of the conversion of an EmtbRE into an automaton is related to the worst case time complexity of the computation of the Null function. Finally we show how to extend the ZPC-structure to EmtbREs, which allows us to apply to this family of extended expressions the efficient constructions based on this structure (in particular the construction of the c-continuation automaton, the position automaton, the follow automaton and the equation automaton).

preprint2013arXiv

On the transition reduction problem for finite automata

We are interested in the problem of transition reduction of nondeterministic automata. We present some results on the reduction of the automata recognizing the language $L(E_n)$ denoted by the regular expression $E_n=(1+\varepsilon)... (2+\varepsilon)... (3+\varepsilon)... (n+\varepsilon)$. These results can be used in the general case of the transition reduction problem.

Djelloul Ziadi

What is connected

Connect this record

See the researcher in context

Building this map preview

11 published item(s)

Mining Frequent Itemsets: a Formal Unification

Algorithm for the k-Position Tree Automaton Construction

Bottom Up Quotients and Residuals for Tree Languages

Construction of rational expression from tree automata using a generalization of Arden's Lemma

Efficient Geometric-based Computation of the String Subsequence Kernel

Rational Kernels for Arabic Stemming and Text Classification

Root-Weighted Tree Automata and their Applications to Tree Kernels

An Efficient Algorithm for the Equation Tree Automaton via the $k$-C-Continuations

K-Position, Follow, Equation and K-C-Continuation Tree Automata Constructions

Extended to Multi-Tilde-Bar Regular Expressions and Efficient Finite Automata Constructions

On the transition reduction problem for finite automata