Source author record

Jeremy Barbay

Jeremy Barbay appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms

Catalog footprint

What is connected

2works

1topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2015arXiv

Refining the Analysis of Divide and Conquer: How and When

Divide-and-conquer is a central paradigm for the design of algorithms, through which some fundamental computational problems, such as sorting arrays and computing convex hulls, are solved in optimal time within $Θ(n\log{n})$ in the worst case over instances of size $n$. A finer analysis of those problems yields complexities within $O(n(1 + \mathcal{H}(n_1, \dots, n_k))) \subseteq O(n(1{+}\log{k})) \subseteq O(n\log{n})$ in the worst case over all instances of size $n$ composed of $k$ "easy" fragments of respective sizes $n_1, \dots, n_k$ summing to $n$, where the entropy function $\mathcal{H}(n_1, \dots, n_k) = \sum_{i=1}^k{\frac{n_i}{n}}\log{\frac{n}{n_i}}$ measures the "difficulty" of the instance. We consider whether such refined analysis can be applied to other algorithms based on divide-and-conquer, such as polynomial multiplication, input-order adaptive computation of convex hulls in 2D and 3D, and computation of Delaunay triangulations.

preprint2012arXiv

Efficient Fully-Compressed Sequence Representations

We present a data structure that stores a sequence $s[1..n]$ over alphabet $[1..σ]$ in $n\Ho(s) + o(n)(\Ho(s){+}1)$ bits, where $\Ho(s)$ is the zero-order entropy of $s$. This structure supports the queries \access, \rank\ and \select, which are fundamental building blocks for many other compressed data structures, in worst-case time $\Oh{\lg\lgσ}$ and average time $\Oh{\lg \Ho(s)}$. The worst-case complexity matches the best previous results, yet these had been achieved with data structures using $n\Ho(s)+o(n\lgσ)$ bits. On highly compressible sequences the $o(n\lgσ)$ bits of the redundancy may be significant compared to the the $n\Ho(s)$ bits that encode the data. Our representation, instead, compresses the redundancy as well. Moreover, our average-case complexity is unprecedented. Our technique is based on partitioning the alphabet into characters of similar frequency. The subsequence corresponding to each group can then be encoded using fast uncompressed representations without harming the overall compression ratios, even in the redundancy. The result also improves upon the best current compressed representations of several other data structures. For example, we achieve $(i)$ compressed redundancy, retaining the best time complexities, for the smallest existing full-text self-indexes; $(ii)$ compressed permutations $π$ with times for $π()$ and $\pii()$ improved to loglogarithmic; and $(iii)$ the first compressed representation of dynamic collections of disjoint sets. We also point out various applications to inverted indexes, suffix arrays, binary relations, and data compressors. ...