Source author record

Przemysław Uznański

Przemysław Uznański appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Data Structures and Algorithms Distributed, Parallel, and Cluster Computing Computational Complexity Discrete Mathematics

Catalog footprint

What is connected

14works

4topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2022arXiv

A time and space optimal stable population protocol solving exact majority

We study population protocols, a model of distributed computing appropriate for modeling well-mixed chemical reaction networks and other physical systems where agents exchange information in pairwise interactions, but have no control over their schedule of interaction partners. The well-studied *majority* problem is that of determining in an initial population of $n$ agents, each with one of two opinions $A$ or $B$, whether there are more $A$, more $B$, or a tie. A *stable* protocol solves this problem with probability 1 by eventually entering a configuration in which all agents agree on a correct consensus decision of $\mathsf{A}$, $\mathsf{B}$, or $\mathsf{T}$, from which the consensus cannot change. We describe a protocol that solves this problem using $O(\log n)$ states ($\log \log n + O(1)$ bits of memory) and optimal expected time $O(\log n)$. The number of states $O(\log n)$ is known to be optimal for the class of polylogarithmic time stable protocols that are "output dominant" and "monotone". These are two natural constraints satisfied by our protocol, making it simultaneously time- and state-optimal for that class. We introduce a key technique called a "fixed resolution clock" to achieve partial synchronization. Our protocol is *nonuniform*: the transition function has the value $\left \lceil {\log n} \right \rceil$ encoded in it. We show that the protocol can be modified to be uniform, while increasing the state complexity to $Θ(\log n \log \log n)$.

preprint2022arXiv

Robust Comparison in Population Protocols

There has recently been a surge of interest in the computational and complexity properties of the population model, which assumes $n$ anonymous, computationally-bounded nodes, interacting at random, and attempting to jointly compute global predicates. Significant work has gone towards investigating majority and consensus dynamics in this model: assuming that each node is initially in one of two states $X$ or $Y$, determine which state had higher initial count. In this paper, we consider a natural generalization of majority/consensus, which we call comparison. We are given two baseline states, $X_0$ and $Y_0$, present in any initial configuration in fixed, possibly small counts. Importantly, one of these states has higher count than the other: we will assume $|X_0| \ge C |Y_0|$ for some constant $C$. The challenge is to design a protocol which can quickly and reliably decide on which of the baseline states $X_0$ and $Y_0$ has higher initial count. We propose a simple algorithm solving comparison: the baseline algorithm uses $O(\log n)$ states per node, and converges in $O(\log n)$ (parallel) time, with high probability, to a state where whole population votes on opinions $X$ or $Y$ at rates proportional to initial $|X_0|$ vs. $|Y_0|$ concentrations. We then describe how such output can be then used to solve comparison. The algorithm is self-stabilizing, in the sense that it converges to the correct decision even if the relative counts of baseline states $X_0$ and $Y_0$ change dynamically during the execution, and leak-robust, in the sense that it can withstand spurious faulty reactions. Our analysis relies on a new martingale concentration result which relates the evolution of a population protocol to its expected (steady-state) analysis, which should be broadly applicable in the context of population protocols and opinion dynamics.

preprint2022arXiv

The Dynamic k-Mismatch Problem

The text-to-pattern Hamming distances problem asks to compute the Hamming distances between a given pattern of length $m$ and all length-$m$ substrings of a given text of length $n\ge m$. We focus on the $k$-mismatch version of the problem, where a distance needs to be returned only if it does not exceed a threshold $k$. We assume $n\le 2m$ (in general, one can partition the text into overlapping blocks). In this work, we show data structures for the dynamic version of this problem supporting two operations: An update performs a single-letter substitution in the pattern or the text, and a query, given an index $i$, returns the Hamming distance between the pattern and the text substring starting at position $i$, or reports that it exceeds $k$. First, we show a data structure with $\tilde{O}(1)$ update and $\tilde{O}(k)$ query time. Then we show that $\tilde{O}(k)$ update and $\tilde{O}(1)$ query time is also possible. These two provide an optimal trade-off for the dynamic $k$-mismatch problem with $k \le \sqrt{n}$: we prove that, conditioned on the strong 3SUM conjecture, one cannot simultaneously achieve $k^{1-Ω(1)}$ time for all operations. For $k\ge \sqrt{n}$, we give another lower bound, conditioned on the Online Matrix-Vector conjecture, that excludes algorithms taking $n^{1/2-Ω(1)}$ time per operation. This is tight for constant-sized alphabets: Clifford et al. (STACS 2018) achieved $\tilde{O}(\sqrt{n})$ time per operation in that case, but with $\tilde{O}(n^{3/4})$ time per operation for large alphabets. We improve and extend this result with an algorithm that, given $1\le x\le k$, achieves update time $\tilde{O}(\frac{n}{k} +\sqrt{\frac{nk}{x}})$ and query time $\tilde{O}(x)$. In particular, for $k\ge \sqrt{n}$, an appropriate choice of $x$ yields $\tilde{O}(\sqrt[3]{nk})$ time per operation, which is $\tilde{O}(n^{2/3})$ when no threshold $k$ is provided.

preprint2020arXiv

An Efficient Noisy Binary Search in Graphs via Median Approximation

Consider a generalization of the classical binary search problem in linearly sorted data to the graph-theoretic setting. The goal is to design an adaptive query algorithm, called a strategy, that identifies an initially unknown target vertex in a graph by asking queries. Each query is conducted as follows: the strategy selects a vertex $q$ and receives a reply $v$: if $q$ is the target, then $v=q$, and if $q$ is not the target, then $v$ is a neighbor of $q$ that lies on a shortest path to the target. Furthermore, there is a noise parameter $0\leq p<\frac{1}{2}$, which means that each reply can be incorrect with probability $p$. The optimization criterion to be minimized is the overall number of queries asked by the strategy, called the query complexity. The query complexity is well understood to be $O(\varepsilon^{-2}\log n)$ for general graphs, where $n$ is the order of the graph and $\varepsilon=\frac{1}{2}-p$. However, implementing such a strategy is computationally expensive, with each query requiring possibly $O(n^2)$ operations. In this work we propose two efficient strategies that keep the optimal query complexity. The first strategy achieves the overall complexity of $O(\varepsilon^{-1}n\log n)$ per a single query. The second strategy is dedicated to graphs of small diameter $D$ and maximum degree $Δ$ and has the average complexity of $O(n+\varepsilon^{-2}DΔ\log n)$ per query. We stress out that we develop an algorithmic tool of graph median approximation that is of independent interest: the median can be efficiently approximated by finding a vertex minimizing the sum of distances to a randomly sampled vertex subset of size $O(\varepsilon^{-2}\log n)$.

preprint2020arXiv

Approximating Text-to-Pattern Distance via Dimensionality Reduction

Text-to-pattern distance is a fundamental problem in string matching, where given a pattern of length $m$ and a text of length $n$, over an integer alphabet, we are asked to compute the distance between pattern and the text at every location. The distance function can be e.g. Hamming distance or $\ell_p$ distance for some parameter $p > 0$. Almost all state-of-the-art exact and approximate algorithms developed in the past $\sim 40$ years were using FFT as a black-box. In this work we present $\widetilde{O}(n/\varepsilon^2)$ time algorithms for $(1\pm\varepsilon)$-approximation of $\ell_2$ distances, and $\widetilde{O}(n/\varepsilon^3)$ algorithm for approximation of Hamming and $\ell_1$ distances, all without use of FFT. This is independent to the very recent development by Chan et al. [STOC 2020], where $O(n/\varepsilon^2)$ algorithm for Hamming distances not using FFT was presented -- although their algorithm is much more "combinatorial", our techniques apply to other norms than Hamming.

preprint2020arXiv

Cardinality estimation using Gumbel distribution

Cardinality estimation is the task of approximating the number of distinct elements in a large dataset with possibly repeating elements. LogLog and HyperLogLog (c.f. Durand and Flajolet [ESA 2003], Flajolet et al. [Discrete Math Theor. 2007]) are small space sketching schemes for cardinality estimation, which have both strong theoretical guarantees of performance and are highly effective in practice. This makes them a highly popular solution with many implementations in big-data systems (e.g. Algebird, Apache DataSketches, BigQuery, Presto and Redis). However, despite having simple and elegant formulation, both the analysis of LogLog and HyperLogLog are extremely involved -- spanning over tens of pages of analytic combinatorics and complex function analysis. We propose a modification to both LogLog and HyperLogLog that replaces discrete geometric distribution with a continuous Gumbel distribution. This leads to a very short, simple and elementary analysis of estimation guarantees, and smoother behavior of the estimator.

preprint2020arXiv

Improved Circular $k$-Mismatch Sketches

The shift distance $\mathsf{sh}(S_1,S_2)$ between two strings $S_1$ and $S_2$ of the same length is defined as the minimum Hamming distance between $S_1$ and any rotation (cyclic shift) of $S_2$. We study the problem of sketching the shift distance, which is the following communication complexity problem: Strings $S_1$ and $S_2$ of length $n$ are given to two identical players (encoders), who independently compute sketches (summaries) $\mathtt{sk}(S_1)$ and $\mathtt{sk}(S_2)$, respectively, so that upon receiving the two sketches, a third player (decoder) is able to compute (or approximate) $\mathsf{sh}(S_1,S_2)$ with high probability. This paper primarily focuses on the more general $k$-mismatch version of the problem, where the decoder is allowed to declare a failure if $\mathsf{sh}(S_1,S_2)>k$, where $k$ is a parameter known to all parties. Andoni et al. (STOC'13) introduced exact circular $k$-mismatch sketches of size $\widetilde{O}(k+D(n))$, where $D(n)$ is the number of divisors of $n$. Andoni et al. also showed that their sketch size is optimal in the class of linear homomorphic sketches. We circumvent this lower bound by designing a (non-linear) exact circular $k$-mismatch sketch of size $\widetilde{O}(k)$; this size matches communication-complexity lower bounds. We also design $(1\pm \varepsilon)$-approximate circular $k$-mismatch sketch of size $\widetilde{O}(\min(\varepsilon^{-2}\sqrt{k}, \varepsilon^{-1.5}\sqrt{n}))$, which improves upon an $\widetilde{O}(\varepsilon^{-2}\sqrt{n})$-size sketch of Crouch and McGregor (APPROX'11).

preprint2016arXiv

A note on distance labeling in planar graphs

A distance labeling scheme is an assignments of labels, that is binary strings, to all nodes of a graph, so that the distance between any two nodes can be computed from their labels and the labels are as short as possible. A major open problem is to determine the complexity of distance labeling in unweighted and undirected planar graphs. It is known that, in such a graph on $n$ nodes, some labels must consist of $Ω(n^{1/3})$ bits, but the best known labeling scheme uses labels of length $O(\sqrt{n}\log n)$ [Gavoille, Peleg, Pérennes, and Raz, J. Algorithms, 2004]. We show that, in fact, labels of length $O(\sqrt{n})$ are enough.

preprint2016arXiv

Randomized algorithms for finding a majority element

Given $n$ colored balls, we want to detect if more than $\lfloor n/2\rfloor$ of them have the same color, and if so find one ball with such majority color. We are only allowed to choose two balls and compare their colors, and the goal is to minimize the total number of such operations. A well-known exercise is to show how to find such a ball with only $2n$ comparisons while using only a logarithmic number of bits for bookkeeping. The resulting algorithm is called the Boyer--Moore majority vote algorithm. It is known that any deterministic method needs $\lceil 3n/2\rceil-2$ comparisons in the worst case, and this is tight. However, it is not clear what is the required number of comparisons if we allow randomization. We construct a randomized algorithm which always correctly finds a ball of the majority color (or detects that there is none) using, with high probability, only $7n/6+o(n)$ comparisons. We also prove that the expected number of comparisons used by any such randomized method is at least $1.019n$.

preprint2016arXiv

Sublinear-Space Distance Labeling using Hubs

A distance labeling scheme is an assignment of bit-labels to the vertices of an undirected, unweighted graph such that the distance between any pair of vertices can be decoded solely from their labels. We propose a series of new labeling schemes within the framework of so-called hub labeling (HL, also known as landmark labeling or 2-hop-cover labeling), in which each node $u$ stores its distance to all nodes from an appropriately chosen set of hubs $S(u) \subseteq V$. For a queried pair of nodes $(u,v)$, the length of a shortest $u-v$-path passing through a hub node from $S(u)\cap S(v)$ is then used as an upper bound on the distance between $u$ and $v$. We present a hub labeling which allows us to decode exact distances in sparse graphs using labels of size sublinear in the number of nodes. For graphs with at most $n$ nodes and average degree $Δ$, the tradeoff between label bit size $L$ and query decoding time $T$ for our approach is given by $L = O(n \log \log_ΔT / \log_ΔT)$, for any $T \leq n$. Our simple approach is thus the first sublinear-space distance labeling for sparse graphs that simultaneously admits small decoding time (for constant $Δ$, we can achieve any $T=ω(1)$ while maintaining $L=o(n)$), and it also provides an improvement in terms of label size with respect to previous slower approaches. By using similar techniques, we then present a $2$-additive labeling scheme for general graphs, i.e., one in which the decoder provides a 2-additive-approximation of the distance between any pair of nodes. We achieve almost the same label size-time tradeoff $L = O(n \log^2 \log T / \log T)$, for any $T \leq n$. To our knowledge, this is the first additive scheme with constant absolute error to use labels of sublinear size. The corresponding decoding time is then small (any $T=ω(1)$ is sufficient).

preprint2016arXiv

Tight tradeoffs for approximating palindromes in streams

We consider computing the longest palindrome in a text of length $n$ in the streaming model, where the characters arrive one-by-one, and we do not have random access to the input. While computing the answer exactly using sublinear memory is not possible in such a setting, one can still hope for a good approximation guarantee. We focus on the two most natural variants, where we aim for either additive or multiplicative approximation of the length of the longest palindrome. We first show that there is no point in considering Las Vegas algorithms in such a setting, as they cannot achieve sublinear space complexity. For Monte Carlo algorithms, we provide a lowerbound of $Ω(\frac{n}{E})$ bits for approximating the answer with additive error $E$, and $Ω(\frac{\log n}{\log(1+\varepsilon)})$ bits for approximating the answer with multiplicative error $(1+\varepsilon)$ for the binary alphabet. Then, we construct a generic Monte Carlo algorithm, which by choosing the parameters appropriately achieves space complexity matching up to a logarithmic factor for both variants. This substantially improves the previous results by Berenbrink et al. (STACS 2014) and essentially settles the space complexity.

preprint2016arXiv

Tight Tradeoffs for Real-Time Approximation of Longest Palindromes in Streams

We consider computing a longest palindrome in the streaming model, where the symbols arrive one-by-one and we do not have random access to the input. While computing the answer exactly using sublinear space is not possible in such a setting, one can still hope for a good approximation guarantee. Our contribution is twofold. First, we provide lower bounds on the space requirements for randomized approximation algorithms processing inputs of length $n$. We rule out Las Vegas algorithms, as they cannot achieve sublinear space complexity. For Monte Carlo algorithms, we prove a lower bounds of $Ω( M \log\min\{|Σ|,M\})$ bits of memory; here $M=n/E$ for approximating the answer with additive error $E$, and $M= \frac{\log n}{\log (1+\varepsilon)}$ for approximating the answer with multiplicative error $(1 + \varepsilon)$. Second, we design three real-time algorithms for this problem. Our Monte Carlo approximation algorithms for both additive and multiplicative versions of the problem use $O(M)$ words of memory. Thus the obtained lower bounds are asymptotically tight up to a logarithmic factor. The third algorithm is deterministic and finds a longest palindrome exactly if it is short. This algorithm can be run in parallel with a Monte Carlo algorithm to obtain better results in practice. Overall, both the time and space complexity of finding a longest palindrome in a stream are essentially settled.

preprint2015arXiv

All Permutations Supersequence is coNP-complete

We prove that deciding whether a given input word contains as subsequence every possible permutation of integers $\{1,2,\ldots,n\}$ is coNP-complete. The coNP-completeness holds even when given the guarantee that the input word contains as subsequences all of length $n-1$ sequences over the same set of integers. We also show NP-completeness of a related problem of Partially Non-crossing Perfect Matching in Bipartite Graphs, i.e. to find a perfect matching in an ordered bipartite graph where edges of the matching incident to selected vertices (even only from one side) are non-crossing.

preprint2015arXiv

Time and space optimality of rotor-router graph exploration

We consider the problem of exploration of an anonymous, port-labeled, undirected graph with $n$ nodes and $m$ edges and diameter $D$, by a single mobile agent. Initially the agent does not know the graph topology nor any of the global parameters. Moreover, the agent does not know the incoming port when entering to a vertex. Each vertex is endowed with memory that can be read and modified by the agent upon its visit to that node. However the agent has no operational memory i.e., it cannot carry any state while traversing an edge. In such a model at least $\log_2 d$ bits are needed at each vertex of degree $d$ for the agent to be able to traverse each graph edge. This number of bits is always sufficient to explore any graph in time $O(mD)$ using algorithm Rotor-Router. We show that even if the available node memory is unlimited then time $Ω(n^3)$ is sometimes required for any algorithm. This shows that Rotor-Router is asymptotically optimal in the worst-case graphs. Secondly we show that for the case of the path the Rotor-Router attains exactly optimal time.

Przemysław Uznański

What is connected

Connect this record

See the researcher in context

Building this map preview

14 published item(s)

A time and space optimal stable population protocol solving exact majority

Robust Comparison in Population Protocols

The Dynamic k-Mismatch Problem

An Efficient Noisy Binary Search in Graphs via Median Approximation

Approximating Text-to-Pattern Distance via Dimensionality Reduction

Cardinality estimation using Gumbel distribution

Improved Circular $k$-Mismatch Sketches

A note on distance labeling in planar graphs

Randomized algorithms for finding a majority element

Sublinear-Space Distance Labeling using Hubs

Tight tradeoffs for approximating palindromes in streams

Tight Tradeoffs for Real-Time Approximation of Longest Palindromes in Streams

All Permutations Supersequence is coNP-complete

Time and space optimality of rotor-router graph exploration