Researcher profile

Panagiotis Charalampopoulos

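Panagiotis Charalampopoulos works on algorithms and data structures for strings and graphs, including approximate pattern matching, edit distance oracles, and distance labeling for planar graphs.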

Researcher | Affiliation not imported | Open to collaborate

Trust snapshot

Quick read

Trust 21 - Emerging | Verification L1 | Unclaimed author
6 works
0 followers
2 topics
4 close collaborators

Published work

6 published items

preprint | 2022 | arXiv

Faster Pattern Matching under Edit Distance

We consider the approximate pattern matching problem under the edit distance. Given a text $T$ of length $n$, a pattern $P$ of length $m$, and a threshold $k$, the task is to find the starting positions of all substrings of $T$ that can be transformed to $P$ with at most $k$ edits. More than 20 years ago, Cole and Hariharan [SODA'98, J. Comput.'02] gave an $\mathcal{O}(n+k^4 \cdot n/ m)$-time algorithm for this classic problem, and this runtime has not been improved since. Here, we present an algorithm that runs in time $\mathcal{O}(n+k^{3.5} \sqrt{\log m \log k} \cdot n/m)$, thus breaking through this long-standing barrier. In the case where $n^{1/4+\varepsilon} \leq k \leq n^{2/5-\varepsilon}$ for some arbitrarily small positive constant $\varepsilon$, our algorithm improves over the state-of-the-art by polynomial factors: it is polynomially faster than both the algorithm of Cole and Hariharan and the classic $\mathcal{O}(kn)$-time algorithm of Landau and Vishkin [STOC'86, J. Algorithms'89]. We observe that the bottleneck case of the alternative $\mathcal{O}(n+k^4 \cdot n/m)$-time algorithm of Charalampopoulos, Kociumaka, and Wellnitz [FOCS'20] is when the text and the pattern are (almost) periodic. Our new algorithm reduces this case to a new dynamic problem (Dynamic Puzzle Matching), which we solve by building on tools developed by Tiskin [SODA'10, Algorithmica'15] for the so-called seaweed monoid of permutation matrices. Our algorithm relies only on a small set of primitive operations on strings and thus also applies to the fully-compressed setting (where text and pattern are given as straight-line programs) and to the dynamic setting (where we maintain a collection of strings under creation, splitting, and concatenation), improving over the state of the art.
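For orientation only, the problem statement above can be illustrated with the textbook quadratic-time dynamic program (usually attributed to Sellers). This is a minimal sketch of a baseline, not the algorithm of the paper. The function name `approx_match_starts` and the 0-based positions are assumptions made for this example, and the sketch assumes $k < m$ so that every reported substring is non-empty. It runs the dynamic program on the reversed strings, so that end positions found there translate into start positions in $T$.

```python
def approx_match_starts(text: str, pattern: str, k: int) -> list[int]:
    """Start positions (0-based) of substrings of `text` within edit distance k of `pattern`.

    Baseline O(n * m) dynamic program; hypothetical helper for illustration only.
    """
    # Reverse both strings: a match ending at position e in the reversed text
    # corresponds to a match starting at position len(text) - e in the original.
    t, p = text[::-1], pattern[::-1]
    m = len(p)
    # col[i] = min edit distance between p[:i] and some suffix of the text read so far.
    col = list(range(m + 1))
    starts = []
    for e, ch in enumerate(t, start=1):
        new = [0] * (m + 1)  # new[0] = 0: a match may begin at any text position
        for i in range(1, m + 1):
            cost = 0 if p[i - 1] == ch else 1
            new[i] = min(col[i - 1] + cost,  # match or substitution
                         col[i] + 1,         # extra text character
                         new[i - 1] + 1)     # extra pattern character
        col = new
        if col[m] <= k:
            starts.append(len(t) - e)
    return sorted(starts)
```

For example, `approx_match_starts("abcdefg", "cde", 1)` reports positions 1, 2 and 3, corresponding to the substrings "bcde", "cde" and "de".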

preprint | 2021 | arXiv

An Almost Optimal Edit Distance Oracle

We consider the problem of preprocessing two strings $S$ and $T$, of lengths $m$ and $n$, respectively, in order to be able to efficiently answer the following queries: Given positions $i,j$ in $S$ and positions $a,b$ in $T$, return the optimal alignment of $S[i \mathinner{.\,.} j]$ and $T[a \mathinner{.\,.} b]$. Let $N=mn$. We present an oracle with preprocessing time $N^{1+o(1)}$ and space $N^{1+o(1)}$ that answers queries in $\log^{2+o(1)}N$ time. In other words, we show that we can query the alignment of every two substrings in almost the same time it takes to compute just the alignment of $S$ and $T$. Our oracle uses ideas from our distance oracle for planar graphs [STOC 2019] and exploits the special structure of the alignment graph. Conditioned on popular hardness conjectures, this result is optimal up to subpolynomial factors. Our results apply to both edit distance and longest common subsequence (LCS). The best previously known oracle with construction time and size $\mathcal{O}(N)$ has slow $\Omega(\sqrt{N})$ query time [Sakai, TCS 2019], and the one with size $N^{1+o(1)}$ and query time $\log^{2+o(1)}N$ (using a planar graph distance oracle) has slow $\Omega(N^{3/2})$ construction time [Long & Pettie, SODA 2021]. We improve both approaches by roughly a $\sqrt N$ factor.
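To make the query format concrete, here is a minimal sketch of the no-preprocessing baseline: each query is answered by running the standard edit distance dynamic program on the two substrings. It returns only the cost of an optimal alignment, not the alignment itself, and it is not the oracle described in the abstract. The function name `substring_edit_distance` and the 0-based inclusive positions are assumptions made for this example.

```python
def substring_edit_distance(s: str, i: int, j: int, t: str, a: int, b: int) -> int:
    """Edit distance between s[i..j] and t[a..b] (0-based, inclusive bounds).

    Baseline that takes time proportional to the product of the two
    substring lengths per query; hypothetical helper for illustration only.
    """
    x, y = s[i:j + 1], t[a:b + 1]
    # prev[c] = edit distance between the processed prefix of x and y[:c].
    prev = list(range(len(y) + 1))
    for r, cx in enumerate(x, start=1):
        cur = [r]  # distance between x[:r] and the empty string
        for c, cy in enumerate(y, start=1):
            cur.append(min(prev[c - 1] + (cx != cy),  # match or substitution
                           prev[c] + 1,               # delete cx
                           cur[c - 1] + 1))           # insert cy
        prev = cur
    return prev[-1]
```

With $S$ = "kitten" and $T$ = "sitting", the query covering both full strings returns 3, while the query for positions 1 to 3 in both strings compares "itt" with "itt" and returns 0.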

preprint | 2021 | arXiv

Fault-Tolerant Distance Labeling for Planar Graphs

In fault-tolerant distance labeling we wish to assign short labels to the vertices of a graph $G$ such that from the labels of any three vertices $u,v,f$ we can infer the $u$-to-$v$ distance in the graph $G\setminus \{f\}$. We show that any directed weighted planar graph (and in fact any graph in a graph family with $O(\sqrt{n})$-size separators, such as minor-free graphs) admits fault-tolerant distance labels of size $O(n^{2/3})$. We extend these labels in a way that allows us to also count the number of shortest paths, and provide additional upper and lower bounds for labels and oracles for counting shortest paths.
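The quantity that the labels must encode, the $u$-to-$v$ distance in $G\setminus \{f\}$, can be computed directly when the whole graph is available. The following minimal sketch does this with Dijkstra's algorithm while skipping the failed vertex. It is a centralized baseline, not a labeling scheme. The function name `distance_avoiding`, the adjacency-list representation, and the assumption of non-negative edge weights are all choices made for this example.

```python
import heapq
import math


def distance_avoiding(graph: dict, u, v, f) -> float:
    """u-to-v distance in the directed weighted graph with vertex f removed.

    `graph` maps each vertex to a list of (neighbor, weight) pairs with
    non-negative weights. Hypothetical helper for illustration only.
    """
    if f in (u, v):
        return math.inf  # an endpoint has failed, so no path exists
    dist = {u: 0}
    heap = [(0, u)]
    while heap:
        d, x = heapq.heappop(heap)
        if x == v:
            return d
        if d > dist.get(x, math.inf):
            continue  # stale heap entry
        for y, w in graph.get(x, []):
            if y == f:
                continue  # never route through the failed vertex
            nd = d + w
            if nd < dist.get(y, math.inf):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return math.inf
```

A labeling scheme has to answer the same question from the labels of $u$, $v$ and $f$ alone, without access to `graph`.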

preprint | 2020 | arXiv

Circular Pattern Matching with $k$ Mismatches

The $k$-mismatch problem consists in computing the Hamming distance between a pattern $P$ of length $m$ and every length-$m$ substring of a text $T$ of length $n$, if this distance is no more than $k$. In many real-world applications, any cyclic rotation of $P$ is a relevant pattern, and thus one is interested in computing the minimal distance of every length-$m$ substring of $T$ and any cyclic rotation of $P$. This is the circular pattern matching with $k$ mismatches ($k$-CPM) problem. A multitude of papers have been devoted to solving this problem but, to the best of our knowledge, only average-case upper bounds are known. In this paper, we present the first non-trivial worst-case upper bounds for the $k$-CPM problem. Specifically, we show an $O(nk)$-time algorithm and an $O(n+\frac{n}{m}\,k^4)$-time algorithm. The latter algorithm applies in an extended way a technique that was very recently developed for the $k$-mismatch problem [Bringmann et al., SODA 2019]. A preliminary version of this work appeared at FCT 2019. In this version we improve the time complexity of the main algorithm from $O(n+\frac{n}{m}\,k^5)$ to $O(n+\frac{n}{m}\,k^4)$.
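The $k$-CPM problem as defined above can be stated as a short brute-force procedure: for every length-$m$ window of the text, take the minimum Hamming distance over all cyclic rotations of the pattern and report it when it is at most $k$. The sketch below does exactly that in roughly $O(n m^2)$ time, so it only illustrates the definition and is unrelated to the algorithms of the paper. The function name `circular_k_mismatch` and the 0-based positions are assumptions, and the pattern is assumed to be non-empty.

```python
def circular_k_mismatch(text: str, pattern: str, k: int) -> list[tuple[int, int]]:
    """Brute-force k-CPM: list of (position, distance) with distance <= k.

    `distance` is the minimum Hamming distance between text[position:position+m]
    and any cyclic rotation of `pattern`. Assumes a non-empty pattern.
    Hypothetical helper for illustration only.
    """
    m = len(pattern)
    # All distinct cyclic rotations of the pattern.
    rotations = {pattern[r:] + pattern[:r] for r in range(m)}
    result = []
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        best = min(sum(x != y for x, y in zip(window, rot)) for rot in rotations)
        if best <= k:
            result.append((i, best))
    return result
```

For instance, with text "abcab" and pattern "abc", every window is itself a rotation of the pattern, so all three positions are reported with distance 0.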

preprint | 2020 | arXiv

Counting Distinct Patterns in Internal Dictionary Matching

We consider the problem of preprocessing a text $T$ of length $n$ and a dictionary $\mathcal{D}$ in order to be able to efficiently answer queries $CountDistinct(i,j)$, that is, given $i$ and $j$ return the number of patterns from $\mathcal{D}$ that occur in the fragment $T[i \mathinner{.\,.} j]$. The dictionary is internal in the sense that each pattern in $\mathcal{D}$ is given as a fragment of $T$. This way, the dictionary takes space proportional to the number of patterns $d=|\mathcal{D}|$ rather than their total length, which could be $\Theta(n\cdot d)$. An $\tilde{\mathcal{O}}(n+d)$-size data structure that answers $CountDistinct(i,j)$ queries $\mathcal{O}(\log n)$-approximately in $\tilde{\mathcal{O}}(1)$ time was recently proposed in a work that introduced internal dictionary matching [ISAAC 2019]. Here we present an $\tilde{\mathcal{O}}(n+d)$-size data structure that answers $CountDistinct(i,j)$ queries $2$-approximately in $\tilde{\mathcal{O}}(1)$ time. Using range queries, for any $m$, we give an $\tilde{\mathcal{O}}(\min(nd/m,n^2/m^2)+d)$-size data structure that answers $CountDistinct(i,j)$ queries exactly in $\tilde{\mathcal{O}}(m)$ time. We also consider the special case when the dictionary consists of all square factors of the string. We design an $\mathcal{O}(n \log^2 n)$-size data structure that allows us to count distinct squares in a text fragment $T[i \mathinner{.\,.} j]$ in $\mathcal{O}(\log n)$ time.
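The semantics of a $CountDistinct(i,j)$ query over an internal dictionary can be pinned down with a direct, unoptimized computation. In the sketch below each pattern is given as a pair of positions into $T$, as in the abstract, and a query simply tests every distinct pattern against the fragment. This shows only what the query returns, not any of the data structures of the paper. The function name `count_distinct` and the 0-based inclusive positions are assumptions made for this example.

```python
def count_distinct(text: str, dictionary: list[tuple[int, int]], i: int, j: int) -> int:
    """Number of distinct dictionary patterns occurring in text[i..j].

    `dictionary` holds (start, end) pairs, 0-based and inclusive, each
    denoting the pattern text[start..end]. Hypothetical helper for
    illustration only; no preprocessing is performed.
    """
    fragment = text[i:j + 1]
    # Two dictionary fragments may spell the same string; count it once.
    patterns = {text[s:e + 1] for s, e in dictionary}
    return sum(1 for p in patterns if p in fragment)
```

With $T$ = "abab" and the dictionary fragments (0,1), (1,2), (0,2), (2,3), the patterns are "ab", "ba", "aba" and again "ab", so a query over the whole text returns 3.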

preprint | 2020 | arXiv

The Number of Repetitions in 2D-Strings

The notions of periodicity and repetitions in strings, and hence those of runs and squares, naturally extend to two-dimensional strings. We consider two types of repetitions in 2D-strings: 2D-runs and quartics (quartics are a 2D-version of squares in standard strings). Amir et al. introduced 2D-runs, showed that there are $O(n^3)$ of them in an $n \times n$ 2D-string and presented a simple construction giving a lower bound of $\Omega(n^2)$ for their number (TCS 2020). We make a significant step towards closing the gap between these bounds by showing that the number of 2D-runs in an $n \times n$ 2D-string is $O(n^2 \log^2 n)$. In particular, our bound implies that the $O(n^2\log n + \textsf{output})$ run-time of the algorithm of Amir et al. for computing 2D-runs is also $O(n^2 \log^2 n)$. We expect this result to allow for exploiting 2D-runs algorithmically in the area of 2D pattern matching. A quartic is a 2D-string composed of $2 \times 2$ identical blocks (2D-strings) that was introduced by Apostolico and Brimkov (TCS 2000), where by quartics they meant only primitively rooted quartics, i.e. built of a primitive block. Here our notion of quartics is more general and analogous to that of squares in 1D-strings. Apostolico and Brimkov showed that there are $O(n^2 \log^2 n)$ occurrences of primitively rooted quartics in an $n \times n$ 2D-string and that this bound is attainable. Consequently, the number of distinct primitively rooted quartics is $O(n^2 \log^2 n)$. Here, we prove that the number of distinct general quartics is also $O(n^2 \log^2 n)$. This extends the rich combinatorial study of the number of distinct squares in a 1D-string, that was initiated by Fraenkel and Simpson (J. Comb. Theory A 1998), to two dimensions. Finally, we show some algorithmic applications of 2D-runs. (Abstract shortened due to arXiv requirements.)
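The definition of a (general) quartic, a 2D-string made of $2 \times 2$ identical blocks, can be checked by brute force on small inputs. The sketch below enumerates every block size and position in a 2D-string given as a list of equal-length rows and collects the distinct quartics it finds. It is meant only to illustrate the definition and makes no attempt at efficiency. The function name `distinct_quartics` and the list-of-rows representation are assumptions made for this example.

```python
def distinct_quartics(grid: list[str]) -> set[tuple[str, ...]]:
    """Set of distinct quartics occurring in a 2D-string given as equal-length rows.

    A quartic is a (2h x 2w) sub-array whose four (h x w) blocks are identical.
    Each quartic is returned as a tuple of its rows. Brute force; hypothetical
    helper for illustration only.
    """
    n_rows = len(grid)
    n_cols = len(grid[0]) if grid else 0
    found = set()
    for h in range(1, n_rows // 2 + 1):
        for w in range(1, n_cols // 2 + 1):
            for r in range(n_rows - 2 * h + 1):
                for c in range(n_cols - 2 * w + 1):
                    # Top-left block, then the other three blocks of the candidate.
                    block = tuple(grid[r + d][c:c + w] for d in range(h))
                    others = [
                        tuple(grid[r + dr + d][c + dc:c + dc + w] for d in range(h))
                        for dr, dc in ((0, w), (h, 0), (h, w))
                    ]
                    if all(o == block for o in others):
                        found.add(tuple(grid[r + d][c:c + 2 * w] for d in range(2 * h)))
    return found
```

On a $4 \times 4$ 2D-string consisting only of the letter a, the sketch finds four distinct quartics, one for each of the sizes $2 \times 2$, $2 \times 4$, $4 \times 2$ and $4 \times 4$.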