Source author record

Alberto Ordóñez

Alberto Ordóñez appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms

Catalog footprint

What is connected

3works

1topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2015arXiv

Efficient and Compact Representations of Prefix Codes

Most of the attention in statistical compression is given to the space used by the compressed sequence, a problem completely solved with optimal prefix codes. However, in many applications, the storage space used to represent the prefix code itself can be an issue. In this paper we introduce and compare several techniques to store prefix codes. Let $N$ be the sequence length and $n$ be the alphabet size. Then a naive storage of an optimal prefix code uses $O(n\log n)$ bits. Our first technique shows how to use $O(n\log\log(N/n))$ bits to store the optimal prefix code. Then we introduce an approximate technique that, for any $0<ε<1/2$, takes $O(n \log \log (1 / ε))$ bits to store a prefix code with average codeword length within an additive $ε$ of the minimum. Finally, a second approximation takes, for any constant $c > 1$, $O(n^{1 / c} \log n)$ bits to store a prefix code with average codeword length at most $c$ times the minimum. In all cases, our data structures allow encoding and decoding of any symbol in $O(1)$ time. We experimentally compare our new techniques with the state of the art, showing that we achieve 6--8-fold space reductions, at the price of a slower encoding (2.5--8 times slower) and decoding (12--24 times slower). The approximations further reduce this space and improve the time significantly, up to recovering the speed of classical implementations, for a moderate penalty in the average code length. As a byproduct, we compare various heuristic, approximate, and optimal algorithms to generate length-restricted codes, showing that the optimal ones are clearly superior and practical enough to be implemented.

preprint2014arXiv

Efficient Compressed Wavelet Trees over Large Alphabets

The {\em wavelet tree} is a flexible data structure that permits representing sequences $S[1,n]$ of symbols over an alphabet of size $σ$, within compressed space and supporting a wide range of operations on $S$. When $σ$ is significant compared to $n$, current wavelet tree representations incur in noticeable space or time overheads. In this article we introduce the {\em wavelet matrix}, an alternative representation for large alphabets that retains all the properties of wavelet trees but is significantly faster. We also show how the wavelet matrix can be compressed up to the zero-order entropy of the sequence without sacrificing, and actually improving, its time performance. Our experimental results show that the wavelet matrix outperforms all the wavelet tree variants along the space/time tradeoff map.

preprint2014arXiv

Queries on LZ-Bounded Encodings

We describe a data structure that stores a string $S$ in space similar to that of its Lempel-Ziv encoding and efficiently supports access, rank and select queries. These queries are fundamental for implementing succinct and compressed data structures, such as compressed trees and graphs. We show that our data structure can be built in a scalable manner and is both small and fast in practice compared to other data structures supporting such queries.