Researcher profile

Francisco Claude

Francisco Claude contributes to research discovery and scholarly infrastructure.

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 17 - Unverified | Verification L1 | Unclaimed author
4 works
0 followers
1 topic
4 close collaborators

Published work

4 published items

preprint | 2020 | arXiv

Grammar-Compressed Indexes with Logarithmic Search Time

Let a text $T[1..n]$ be the only string generated by a context-free grammar with $g$ (terminal and nonterminal) symbols, and of size $G$ (measured as the sum of the lengths of the right-hand sides of the rules). Such a grammar, called a grammar-compressed representation of $T$, can be encoded using essentially $G\lg g$ bits. We introduce the first grammar-compressed index that uses $O(G\lg n)$ bits and can find the $occ$ occurrences of patterns $P[1..m]$ in time $O((m^2+occ)\lg G)$. We implement the index and demonstrate its practicality in comparison with the state of the art, on highly repetitive text collections.
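
To make the two size measures in the abstract concrete, here is a minimal Python sketch. The toy grammar and the helper names `expand` and `grammar_stats` are hypothetical and not taken from the paper. It only shows how a grammar with one rule per nonterminal generates a single text $T$, and how $g$ (distinct symbols) and $G$ (total right-hand-side length) are counted. It is not the index itself.

```python
# Hypothetical toy grammar (an assumption for illustration): every
# nonterminal has exactly one rule, so the grammar generates one string.
rules = {
    "S": ["A", "B", "A"],
    "A": ["B", "B"],
    "B": ["a", "b"],
}


def expand(symbol, rules):
    """Return the string derived from `symbol`; terminals derive themselves."""
    if symbol not in rules:
        return symbol
    return "".join(expand(child, rules) for child in rules[symbol])


def grammar_stats(rules):
    """Return (g, G): distinct symbols and summed right-hand-side lengths."""
    symbols = set(rules)
    for rhs in rules.values():
        symbols.update(rhs)
    g = len(symbols)
    G = sum(len(rhs) for rhs in rules.values())
    return g, G


text = expand("S", rules)
g, G = grammar_stats(rules)
print(text, g, G)
```

In this toy case the text has $n = 10$ characters while $G = 7$. On highly repetitive collections the gap between $n$ and $G$ can be far larger, which is the setting the abstract targets.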

preprint | 2012 | arXiv

Compact Binary Relation Representations with Rich Functionality

Binary relations are an important abstraction arising in many data representation problems. The data structures proposed so far to represent them support just a few basic operations required to fit one particular application. We identify many of those operations arising in applications and generalize them into a wide set of desirable queries for a binary relation representation. We also identify reductions among those operations. We then introduce several novel binary relation representations, some simple and some quite sophisticated, that not only are space-efficient but also efficiently support a large subset of the desired queries.

preprint | 2012 | arXiv

Efficient Fully-Compressed Sequence Representations

We present a data structure that stores a sequence $s[1..n]$ over alphabet $[1..\sigma]$ in $nH_0(s) + o(n)(H_0(s)+1)$ bits, where $H_0(s)$ is the zero-order entropy of $s$. This structure supports the queries access, rank and select, which are fundamental building blocks for many other compressed data structures, in worst-case time $O(\lg\lg\sigma)$ and average time $O(\lg H_0(s))$. The worst-case complexity matches the best previous results, yet these had been achieved with data structures using $nH_0(s)+o(n\lg\sigma)$ bits. On highly compressible sequences the $o(n\lg\sigma)$ bits of the redundancy may be significant compared to the $nH_0(s)$ bits that encode the data. Our representation, instead, compresses the redundancy as well. Moreover, our average-case complexity is unprecedented. Our technique is based on partitioning the alphabet into characters of similar frequency. The subsequence corresponding to each group can then be encoded using fast uncompressed representations without harming the overall compression ratios, even in the redundancy. The result also improves upon the best current compressed representations of several other data structures. For example, we achieve $(i)$ compressed redundancy, retaining the best time complexities, for the smallest existing full-text self-indexes; $(ii)$ compressed permutations $\pi$ with times for $\pi()$ and $\pi^{-1}()$ improved to loglogarithmic; and $(iii)$ the first compressed representation of dynamic collections of disjoint sets. We also point out various applications to inverted indexes, suffix arrays, binary relations, and data compressors. ...
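
For readers unfamiliar with the three queries and the entropy measure named in the abstract, the following Python sketch gives plain, linear-time reference definitions. The function names and the 1-based indexing convention are assumptions chosen to match the $s[1..n]$ notation. None of this reflects the compressed structure or the alphabet-partitioning technique of the paper.

```python
import math
from collections import Counter


def zero_order_entropy(s):
    """H_0(s) = sum over characters c of (n_c / n) * log2(n / n_c)."""
    n = len(s)
    counts = Counter(s)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def access(s, i):
    """Return s[i], using 1-based positions."""
    return s[i - 1]


def rank(s, c, i):
    """Return the number of occurrences of c in s[1..i]."""
    return s[:i].count(c)


def select(s, c, j):
    """Return the 1-based position of the j-th occurrence of c, or None."""
    seen = 0
    for pos, ch in enumerate(s, start=1):
        if ch == c:
            seen += 1
            if seen == j:
                return pos
    return None


s = "abracadabra"
print(access(s, 5), rank(s, "a", 8), select(s, "a", 3))
print(zero_order_entropy("aabb"))
```

These baselines scan the sequence on every query. The abstract's contribution is answering the same queries in $O(\lg\lg\sigma)$ worst-case time while storing the sequence in space close to $nH_0(s)$ bits.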

preprint | 2011 | arXiv

Improved Grammar-Based Compressed Indexes

We introduce the first grammar-compressed representation of a sequence that supports searches in time that depends only logarithmically on the size of the grammar. Given a text $T[1..u]$ that is represented by a (context-free) grammar of $n$ (terminal and nonterminal) symbols and size $N$ (measured as the sum of the lengths of the right-hand sides of the rules), a basic grammar-based representation of $T$ takes $N\lg n$ bits of space. Our representation requires $2N\lg n + N\lg u + \epsilon\, n\lg n + o(N\lg n)$ bits of space, for any $0<\epsilon\le 1$. It can find the positions of the $occ$ occurrences of a pattern of length $m$ in $T$ in $O((m^2/\epsilon)\lg (\frac{\lg u}{\lg n}) +occ\lg n)$ time, and extract any substring of length $\ell$ of $T$ in time $O(\ell+h\lg(N/h))$, where $h$ is the height of the grammar tree.
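
The extraction operation mentioned in the abstract can be illustrated with a simple Python sketch that stores the expansion length of every symbol and descends the grammar tree, visiting only the children that overlap the requested range. The toy grammar and the names `expansion_lengths`, `extract` and `substring` are hypothetical, and terminals are assumed to be single characters. This is a textbook-style illustration, not the paper's representation, and it does not attempt the stated space bound or the exact $O(\ell+h\lg(N/h))$ extraction time.

```python
# Hypothetical toy grammar with one rule per nonterminal (an assumption).
rules = {
    "S": ["A", "B", "A"],
    "A": ["B", "B"],
    "B": ["a", "b"],
}


def expansion_lengths(rules):
    """Map every nonterminal to the length of the string it derives."""
    lengths = {}

    def length(sym):
        if sym not in rules:
            return 1  # terminals are assumed to be single characters
        if sym not in lengths:
            lengths[sym] = sum(length(child) for child in rules[sym])
        return lengths[sym]

    for sym in rules:
        length(sym)
    return lengths


def extract(sym, start, count, rules, lengths, out):
    """Append `count` characters of sym's expansion, from 0-based `start`."""
    if count <= 0:
        return
    if sym not in rules:
        out.append(sym)
        return
    offset = 0
    for child in rules[sym]:
        child_len = lengths.get(child, 1)
        lo = max(start, offset)
        hi = min(start + count, offset + child_len)
        if lo < hi:  # this child overlaps the requested range
            extract(child, lo - offset, hi - lo, rules, lengths, out)
        offset += child_len
        if offset >= start + count:
            break


def substring(rules, root, start, count):
    """Return the substring without expanding the whole text."""
    out = []
    extract(root, start, count, rules, expansion_lengths(rules), out)
    return "".join(out)


print(substring(rules, "S", 3, 4))
```

The key design point is that the stored lengths let the descent skip whole subtrees outside the range, so the full text of length $u$ is never materialized.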