Source author record

Stefan Walzer

Stefan Walzer appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher, unclaimed source record

Topics: Data Structures and Algorithms, math.CO, Databases, Discrete Mathematics

Catalog footprint

What is connected

5 works, 4 topics, 4 close collaborators

Preprint, 2022, arXiv

Fast Succinct Retrieval and Approximate Membership using Ribbon

A retrieval data structure for a static function $f:S\rightarrow \{0,1\}^r$ supports queries that return $f(x)$ for any $x \in S$. Retrieval data structures can be used to implement a static approximate membership query data structure (AMQ), i.e., a Bloom filter alternative, with false positive rate $2^{-r}$. The information-theoretic lower bound for both tasks is $r|S|$ bits. While succinct theoretical constructions using $(1+o(1))r|S|$ bits were known, these could not achieve very small overheads in practice because they have an unfavorable space--time tradeoff hidden in the asymptotic costs or because small overheads would only be reached for physically impossible input sizes. With bumped ribbon retrieval (BuRR), we present the first practical succinct retrieval data structure. In an extensive experimental evaluation BuRR achieves space overheads well below 1\,\% while being faster than most previously used retrieval data structures (typically with space overheads at least an order of magnitude larger) and faster than classical Bloom filters (with space overhead $\geq 44\,\%$). This efficiency, including favorable constants, stems from a combination of simplicity, word parallelism, and high locality. We additionally describe homogeneous ribbon filter AMQs, which are even simpler and faster at the price of slightly larger space overhead.
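The space overheads quoted in this abstract are measured against the lower bound of $r$ bits per key for false positive rate $2^{-r}$. The following is a minimal sketch of that arithmetic, not code from the paper. The function names are hypothetical, and the Bloom filter figure assumes the textbook formula for a classical Bloom filter with an optimally chosen number of hash functions.

```python
import math


def lower_bound_bits_per_key(fpr):
    """Information-theoretic minimum: log2(1/fpr) bits per key."""
    return math.log2(1 / fpr)


def bloom_bits_per_key(fpr):
    """Bits per key of a classical Bloom filter with an optimal number of
    hash functions (textbook formula, assumed here): log2(1/fpr) / ln 2."""
    return math.log2(1 / fpr) / math.log(2)


def space_overhead(bits_per_key, fpr):
    """Relative overhead over the lower bound, e.g. 0.44 means 44 %."""
    return bits_per_key / lower_bound_bits_per_key(fpr) - 1


if __name__ == "__main__":
    fpr = 2 ** -8  # r = 8 bits per key at the lower bound
    bloom = bloom_bits_per_key(fpr)
    print(f"lower bound: {lower_bound_bits_per_key(fpr):.2f} bits/key")
    print(f"Bloom:       {bloom:.2f} bits/key")
    print(f"overhead:    {space_overhead(bloom, fpr):.1%}")
```

Under this formula the Bloom overhead is $1/\ln 2 - 1 \approx 0.44$ regardless of the false positive rate, which matches the 44 % figure in the abstract. An overhead of 1 % at $r = 8$ would correspond to about 8.08 bits per key.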

Preprint, 2022, arXiv

Insertion Time of Random Walk Cuckoo Hashing below the Peeling Threshold

Most hash tables have an insertion time of $O(1)$, possibly qualified as expected and/or amortised. While insertions into cuckoo hash tables indeed seem to take $O(1)$ expected time in practice, only polylogarithmic guarantees are proven in all but the simplest of practically relevant cases. Given the widespread use of cuckoo hashing to implement compact dictionaries and Bloom filter alternatives, closing this gap is an important open problem for theoreticians. In this paper, we show that random walk insertions into cuckoo hash tables take $O(1)$ expected amortised time when any number $k \geq 3$ of hash functions is used and the load factor is below the corresponding peeling threshold (e.g. $\approx 0.81$ for $k = 3$). To our knowledge, this is the first meaningful guarantee for constant time insertion for cuckoo hashing that works for $k \in \{3,\dots,9\}$. In addition to being useful in its own right, we hope that our key-centred analysis method can be a stepping stone on the path to the true end goal: $O(1)$ time insertions for all load factors below the load threshold (e.g. $\approx 0.91$ for $k = 3$).
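For readers unfamiliar with the insertion procedure analysed here, the following is a minimal sketch of random walk insertion into a cuckoo hash table with $k = 3$ hash functions and one key per slot. It is an illustration under stated assumptions, not the paper's exact model: the class name, the BLAKE2b-based hash functions, the step limit, and the rule of evicting a uniformly random candidate slot are all choices made for this example.

```python
import hashlib
import random


class RandomWalkCuckooTable:
    """Cuckoo hash table with k candidate slots per key (illustrative only)."""

    def __init__(self, num_slots, k=3, seed=0, max_steps=10_000):
        self.slots = [None] * num_slots
        self.k = k
        self.max_steps = max_steps  # assumed safety limit, not from the paper
        self.rng = random.Random(seed)

    def positions(self, key):
        """The k candidate slots of a key, from k salted BLAKE2b hashes."""
        data = repr(key).encode()
        result = []
        for i in range(self.k):
            digest = hashlib.blake2b(data, digest_size=8, salt=bytes([i])).digest()
            result.append(int.from_bytes(digest, "little") % len(self.slots))
        return result

    def contains(self, key):
        return any(self.slots[p] == key for p in self.positions(key))

    def insert(self, key):
        """Insert key and return the number of evictions that were needed."""
        current = key
        for evictions in range(self.max_steps):
            candidates = self.positions(current)
            for pos in candidates:
                if self.slots[pos] is None:
                    self.slots[pos] = current
                    return evictions
            # All candidates are occupied: place `current` in a random
            # candidate slot and continue the walk with the evicted key.
            pos = self.rng.choice(candidates)
            current, self.slots[pos] = self.slots[pos], current
        # Note: at this point the key held in `current` is not stored.
        raise RuntimeError("random walk exceeded max_steps")


if __name__ == "__main__":
    table = RandomWalkCuckooTable(num_slots=1000)
    # Load factor 0.7, below the peeling threshold of about 0.81 for k = 3.
    total = sum(table.insert(key) for key in range(700))
    print("average evictions per insertion:", total / 700)
```

The return value of `insert` is the length of the random walk, which is the quantity whose expected amortised value the paper bounds by $O(1)$ below the peeling threshold.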

Preprint, 2021, arXiv

Ribbon filter: practically smaller than Bloom and Xor

Filter data structures over-approximate a set of hashable keys, i.e. set membership queries may incorrectly come out positive. A filter with false positive rate $f \in (0,1]$ is known to require $\ge \log_2(1/f)$ bits per key. At least for larger $f \ge 2^{-4}$, existing practical filters require a space overhead of at least 20% with respect to this information-theoretic bound. We introduce the Ribbon filter: a new filter for static sets with a broad range of configurable space overheads and false positive rates with competitive speed over that range, especially for larger $f \ge 2^{-7}$. In many cases, Ribbon is faster than existing filters for the same space overhead, or can achieve space overhead below 10% with some additional CPU time. An experimental Ribbon design with load balancing can even achieve space overheads below 1%. A Ribbon filter resembles an Xor filter modified to maximize locality and is constructed by solving a band-like linear system over Boolean variables. In previous work, Dietzfelbinger and Walzer describe this linear system and an efficient Gaussian solver. We present and analyze a faster, more adaptable solving process we call "Rapid Incremental Boolean Banding ON the fly," which resembles hash table construction. We also present and analyze an attractive Ribbon variant based on making the linear system homogeneous, and describe several more practical enhancements.
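The construction described in this abstract (a band-like linear system over Boolean variables, solved incrementally in a way that resembles hash table insertion) can be illustrated with a small sketch. This is a simplified teaching version, not the authors' implementation: the ribbon width `W`, the slack factor, the BLAKE2b hashing, the retry-over-seeds loop, and all function names are assumptions made for this example.

```python
import hashlib

W = 32  # ribbon width in bits (assumed value for illustration)


def row_for(key, seed, m, r):
    """Hash a key to (start position, W-bit coefficient row, r-bit fingerprint)."""
    digest = hashlib.blake2b(repr(key).encode(), digest_size=16,
                             salt=seed.to_bytes(8, "little")).digest()
    h = int.from_bytes(digest, "little")
    start = (h & 0xFFFFFFFF) % (m - W + 1)
    coeff = ((h >> 32) & ((1 << W) - 1)) | 1  # bit 0 set: row begins at start
    fingerprint = (h >> 64) & ((1 << r) - 1)
    return start, coeff, fingerprint


def _insert(coeffs, rhs, start, coeff, value):
    """Incremental Gaussian elimination of one row into the banded system."""
    while True:
        if coeffs[start] == 0:  # free pivot slot, like an empty hash bucket
            coeffs[start], rhs[start] = coeff, value
            return True
        coeff ^= coeffs[start]
        value ^= rhs[start]
        if coeff == 0:
            return value == 0  # redundant row is fine, contradiction fails
        shift = (coeff & -coeff).bit_length() - 1  # number of trailing zeros
        coeff >>= shift
        start += shift


def _back_substitute(coeffs, rhs):
    """Solve the triangular system from the last slot to the first."""
    z = [0] * len(coeffs)  # free variables are simply left at 0
    for i in range(len(coeffs) - 1, -1, -1):
        if coeffs[i]:
            acc, c, j = rhs[i], coeffs[i] >> 1, i + 1
            while c:
                if c & 1:
                    acc ^= z[j]
                c >>= 1
                j += 1
            z[i] = acc
    return z


def build(keys, r=8, slack=1.25, max_seeds=100):
    """Build a static filter; retry with a new seed if the system is unsolvable."""
    keys = list(keys)
    m = int(slack * len(keys)) + W
    for seed in range(max_seeds):
        coeffs, rhs = [0] * m, [0] * m
        if all(_insert(coeffs, rhs, *row_for(k, seed, m, r)) for k in keys):
            return seed, m, r, _back_substitute(coeffs, rhs)
    raise RuntimeError("no seed produced a solvable system")


def might_contain(filt, key):
    """XOR the solution entries selected by the key's row; compare to fingerprint."""
    seed, m, r, z = filt
    start, coeff, fingerprint = row_for(key, seed, m, r)
    acc, j = 0, start
    while coeff:
        if coeff & 1:
            acc ^= z[j]
        coeff >>= 1
        j += 1
    return acc == fingerprint
```

Every inserted key satisfies its own equation, so queries for members always return true, while a non-member matches its $r$-bit fingerprint with probability about $2^{-r}$. The generous 25 % slack keeps this sketch simple. The low overheads reported in the abstract depend on the tuning and enhancements described in the paper, which are not reproduced here.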

Preprint, 2015, arXiv

Boolean lattices: Ramsey properties and embeddings

A subposet $Q'$ of a poset $Q$ is a copy of a poset $P$ if there is a bijection $f$ between elements of $P$ and $Q'$ such that $x\leq y$ in $P$ iff $f(x)\leq f(y)$ in $Q'$. For posets $P, P'$, let the poset Ramsey number $R(P,P')$ be the smallest $N$ such that no matter how the elements of the Boolean lattice $Q_N$ are colored red and blue, there is a copy of $P$ with all red elements or a copy of $P'$ with all blue elements. We provide some general bounds on $R(P,P')$ and focus on the situation when $P$ and $P'$ are both Boolean lattices. In addition, we give asymptotically tight bounds for the number of copies of $Q_n$ in $Q_N$ and for a multicolor version of a poset Ramsey number.

Preprint, 2014, arXiv

Playing weighted Tron on Trees

We consider the weighted version of the Tron game on graphs where two players, Alice and Bob, each build their own path by claiming one vertex at a time, starting with Alice. The vertices carry non-negative weights that sum up to 1 and each player tries to claim a path with larger total weight than the opponent's. We show that if the graph is a tree then Alice can always ensure that she gets at most 1/5 less than Bob, and that there exist trees where Bob can ensure that he gets at least 1/5 more than Alice.