Researcher profile

Anne Driemel

Anne Driemel contributes to research discovery and scholarly infrastructure.

ResearcherAffiliation not importedOpen to collaborate

Trust snapshot

Quick read

Trust 21 - EmergingVerification L1Unclaimed author
10works
0followers
2topics
4close collaborators

Actions

Decide how to stay connected

Follow researcher0

Identity and collaboration

How to connect with this researcher

Claiming links this public author record to a researcher profile and unlocks direct collaboration workflows.

Log in to claim

Direct collaboration

Open a focused conversation when the fit is right

Claim this author entity first to unlock direct invitations.

Research graph

See the researcher in context

Open full explorer

Inspect adjacent work, topics, institutions and collaborators without jumping out to a separate graph page.

Building this graph slice

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Published work

10 published item(s)

preprint2024arXiv

Range Reporting for Time Series via Rectangle Stabbing

We study the Fréchet queries problem. It is a data structure problem, where we are given a set $S$ of $n$ polygonal curves and a distance threshold $ρ$. The data structure should support queries with a polygonal curve $q$ for the elements of $S$, for which the continuous Fréchet distance to $q$ is at most $ρ$. Afshani and Driemel in 2018 studied this problem for two-dimensional polygonal curves and gave upper and lower bounds on the space-query time tradeoff. We study the case that the ambient space of the curves is one-dimensional and show an intimate connection to the well-studied rectangle stabbing problem. Here, we are given a set of hyperrectangles as input and a query with a point $q$ should return all input rectangles that contain this point. Using known data structures for rectangle stabbing or orthogonal range searching this directly leads to a data structure with $\mathcal{O}(n \log ^{t-1} n)$ storage and $\mathcal{O}(\log^{t-1} n+k)$ query time, where $k$ denotes the output size and $t$ can be chosen as the maximum number of vertices of either (a) the stored curves or (b) the query curves. The resulting bounds improve upon the bounds by Afshani and Driemel in both the storage and query time. In addition, we show that known lower bounds for rectangle stabbing and orthogonal range reporting with dimension parameter $d= \lfloor t/2 \rfloor$ can be applied to our problem via reduction. .

preprint2024arXiv

Revisiting the Fréchet distance between piecewise smooth curves

Since its introduction to computational geometry by Alt and Godau in 1992, the Fréchet distance has been a mainstay of algorithmic research on curve similarity computations. The focus of the research has been on comparing polygonal curves, with the notable exception of an algorithm for the decision problem for planar piecewise smooth curves due to Rote (2007). We present an algorithm for the decision problem for piecewise smooth curves that is both conceptually simpler and naturally extends to the first algorithm for the problem for piecewise smooth curves in $\mathbb{R}^d$. We assume that the algorithm is given two continuous curves, each consisting of a sequence of $m$, resp.\ $n$, smooth pieces, where each piece belongs to a sufficiently well-behaved class of curves, such as the set of algebraic curves of bounded degree. We introduce a decomposition of the free space diagram into a controlled number of pieces that can be used to solve the decision problem similarly to the polygonal case, in $O(mn)$ time, leading to a computation of the Fréchet distance that runs in $O(mn\log(mn))$ time. Furthermore, we study approximation algorithms for piecewise smooth curves that are also $c$-packed for some fixed value $c$. We adapt the existing framework for $(1+ε)$-approximations and show that an approximate decision can be computed in $O(cn/ε)$ time for any $ε> 0$.

preprint2022arXiv

Analysis of a Greedy Heuristic for the Labeling of a Map with a Time-Window Interface

In this paper, we analyze the approximation quality of a greedy heuristic for automatic map labeling. As input, we have a set of events, each associated with a label at a fixed position, a timestamp, and a weight. Let a time-window labeling be a selection of these labels such that all corresponding timestamps lie in a queried time window and no two labels overlap. A solution to the time-window labeling problem consists of a data structure that encodes a time-window labeling for each possible time window; when a user specifies a time window of interest using a slider interface, we query the data structure for the corresponding labeling. We define the quality of a time-window labeling solution as the sum of the weights of the labels in each time-window labeling, integrated over all time windows. We aim at maximizing the quality under the condition that a label may never disappear when the user shrinks the time window. In this paper, we analyze how well a greedy heuristic approximates the maximum quality that can be realized under this condition. On the one hand, we present an instance with square labels of equal size and equal weight for which the greedy heuristic fails to find a solution of at least 1/4 of the quality of an optimal solution. On the other hand, we prove that the greedy heuristic does guarantee a solution with at least 1/8 of the quality of an optimal solution. In the case of disk-shaped labels of equal size and equal weight, the greedy heuristic gives a solution with at least 1/10 of the quality of an optimal solution. If the labels are squares or disks of equal size and the maximum weight divided by the minimum weight is at most b, then the greedy heuristic has approximation ratio Theta(log b).

preprint2022arXiv

Approximating Length-Restricted Means under Dynamic Time Warping

We study variants of the mean problem under the $p$-Dynamic Time Warping ($p$-DTW) distance, a popular and robust distance measure for sequential data. In our setting we are given a set of finite point sequences over an arbitrary metric space and we want to compute a mean point sequence of given length that minimizes the sum of $p$-DTW distances, each raised to the $q$\textsuperscript{th} power, between the input sequences and the mean sequence. In general, the problem is $\mathrm{NP}$-hard and known not to be fixed-parameter tractable in the number of sequences. On the positive side, we show that restricting the length of the mean sequence significantly reduces the hardness of the problem. We give an exact algorithm running in polynomial time for constant-length means. We explore various approximation algorithms that provide a trade-off between the approximation factor and the running time. Our approximation algorithms have a running time with only linear dependency on the number of input sequences. In addition, we use our mean algorithms to obtain clustering algorithms with theoretical guarantees.

preprint2022arXiv

Faster Approximate Covering of Subcurves under the Fréchet Distance

Subtrajectory clustering is an important variant of the trajectory clustering problem, where the start and endpoints of trajectory patterns within the collected trajectory data are not known in advance. We study this problem in the form of a set cover problem for a given polygonal curve: find the smallest number $k$ of representative curves such that any point on the input curve is contained in a subcurve that has Fréchet distance at most a given $Δ$ to a representative curve. We focus on the case where the representative curves are line segments and approach this NP-hard problem with classical techniques from the area of geometric set cover: we use a variant of the multiplicative weights update method which was first suggested by Brönniman and Goodrich for set cover instances with small VC-dimension. We obtain a bicriteria-approximation algorithm that computes a set of $O(k\log(k))$ line segments that cover a given polygonal curve of $n$ vertices under Fréchet distance at most $O(Δ)$. We show that the algorithm runs in $\widetilde{O}(k^2 n + k n^3)$ time in expectation and uses $ \widetilde{O}(k n + n^3)$ space. For two dimensional input curves that are $c$-packed, we bound the expected running time by $\widetilde{O}(k^2 c^2 n)$ and the space by $ \widetilde{O}(kn + c^2 n)$. In $\mathbb{R}^d$ the dependency on $n$ instead is quadratic. In addition, we present a variant of the algorithm that uses implicit weight updates on the candidate set and thereby achieves near-linear running time in $n$ without any assumptions on the input curve, while keeping the same approximation bounds. This comes at the expense of a small (polylogarithmic) dependency on the relative arclength.

preprint2022arXiv

Minimum-Error Triangulations for Sea Surface Reconstruction

We apply state-of-the-art computational geometry methods to the problem of reconstructing a time-varying sea surface from tide gauge records. Our work builds on a recent article by Nitzke et al.~(Computers \& Geosciences, 157:104920, 2021) who have suggested to learn a triangulation $D$ of a given set of tide gauge stations. The objective is to minimize the misfit of the piecewise linear surface induced by $D$ to a reference surface that has been acquired with satellite altimetry. The authors restricted their search to k-order Delaunay ($k$-OD) triangulations and used an integer linear program in order to solve the resulting optimization problem. In geometric terms, the input to our problem consists of two sets of points in $\mathbb{R}^2$ with elevations: a set $\mathcal{S}$ that is to be triangulated, and a set $\mathcal{R}$ of reference points. Intuitively, we define the error of a triangulation as the average vertical distance of a point in $\mathcal{R}$ to the triangulated surface that is obtained by interpolating elevations of $\mathcal{S}$ linearly in each triangle. Our goal is to find the triangulation of $\mathcal{S}$ that has minimum error with respect to $\mathcal{R}$. In our work, we prove that the minimum-error triangulation problem is NP-hard and cannot be approximated within any multiplicative factor in polynomial time unless $P=NP$. At the same time we show that the problem instances that occur in our application (considering sea level data from several hundreds of tide gauge stations worldwide) can be solved relatively fast using dynamic programming when restricted to $k$-OD triangulations for $k\le 7$. In particular, instances for which the number of connected components of the so-called $k$-OD fixed-edge graph is small can be solved within few seconds.

preprint2022arXiv

On Computing the $k$-Shortcut Fréchet Distance

The Fréchet distance is a popular measure of dissimilarity for polygonal curves. It is defined as a min-max formulation that considers all direction-preserving continuous bijections of the two curves. Because of its susceptibility to noise, Driemel and Har-Peled introduced the shortcut Fréchet distance in 2012, where one is allowed to take shortcuts along one of the curves, similar to the edit distance for sequences. We analyse the parameterized version of this problem, where the number of shortcuts is bounded by a parameter $k$. The corresponding decision problem can be stated as follows: Given two polygonal curves $T$ and $B$ of at most $n$ vertices, a parameter $k$ and a distance threshold $δ$, is it possible to introduce $k$ shortcuts along $B$ such that the Fréchet distance of the resulting curve and the curve $T$ is at most $δ$? We study this problem for polygonal curves in the plane. We provide a complexity analysis for this problem with the following results: (i) assuming the exponential-time-hypothesis (ETH), there exists no algorithm with running time bounded by $n^{o(k)}$; (ii) there exists a decision algorithm with running time in $O(kn^{2k+2}\log n)$. In contrast, we also show that efficient approximate decider algorithms are possible, even when $k$ is large. We present a $(3+\varepsilon)$-approximate decider algorithm with running time in $O(k n^2 \log^2 n)$ for fixed $\varepsilon$. In addition, we can show that, if $k$ is a constant and the two curves are $c$-packed for some constant $c$, then the approximate decider algorithm runs in near-linear time.

preprint2022arXiv

Pattern matching under DTW distance

In this work, we consider the problem of pattern matching under the dynamic time warping (DTW) distance motivated by potential applications in the analysis of biological data produced by the third generation sequencing. To measure the DTW distance between two strings, one must "warp" them, that is, double some letters in the strings to obtain two equal-lengths strings, and then sum the distances between the letters in the corresponding positions. When the distances between letters are integers, we show that for a pattern P with m runs and a text T with n runs: 1. There is an O(m + n)-time algorithm that computes all locations where the DTW distance from P to T is at most 1; 2. There is an O(kmn)-time algorithm that computes all locations where the DTW distance from P to T is at most k. As a corollary of the second result, we also derive an approximation algorithm for general metrics on the alphabet.

preprint2021arXiv

$(2+ε)$-ANN for time series under the Fréchet distance

We study approximate-near-neighbor data structures for time series under the continuous Fréchet distance. For an attainable approximation factor $c>1$ and a query radius $r$, an approximate-near-neighbor data structure can be used to preprocess $n$ curves in $\mathbb{R}$ (aka time series), each of complexity $m$, to answer queries with a curve of complexity $k$ by either returning a curve that lies within Fréchet distance $cr$, or answering that there exists no curve in the input within distance $r$. In both cases, the answer is correct. Our first data structure achieves a $(5+ε)$ approximation factor, uses space in $n\cdot \mathcal{O}\left({ε^{-1}}\right)^{k} + \mathcal{O}(nm)$ and has query time in $\mathcal{O}\left(k\right)$. Our second data structure achieves a $(2+ε)$ approximation factor, uses space in $n\cdot \mathcal{O}\left(\frac{m}{kε}\right)^{k} + \mathcal{O}(nm)$ and has query time in $\mathcal{O}\left(k\cdot 2^k\right)$. Our third positive result is a probabilistic data structure based on locality-sensitive hashing, which achieves space in $\mathcal{O}(n\log n+nm)$ and query time in $\mathcal{O}(k\log n)$, and which answers queries with an approximation factor in $\mathcal{O}(k)$. All of our data structures make use of the concept of signatures, which were originally introduced for the problem of clustering time series under the Fréchet distance. In addition, we show lower bounds for this problem. Consider any data structure which achieves an approximation factor less than $2$ and which supports curves of arclength up to $L$ and answers the query using only a constant number of probes. We show that under reasonable assumptions on the word size any such data structure needs space in $L^{Ω(k)}$.

preprint2020arXiv

On the hardness of computing an average curve

We study the complexity of clustering curves under $k$-median and $k$-center objectives in the metric space of the Fréchet distance and related distance measures. Building upon recent hardness results for the minimum-enclosing-ball problem under the Fréchet distance, we show that also the $1$-median problem is NP-hard. Furthermore, we show that the $1$-median problem is W[1]-hard with the number of curves as parameter. We show this under the discrete and continuous Fréchet and Dynamic Time Warping (DTW) distance. This yields an independent proof of an earlier result by Bulteau et al. from 2018 for a variant of DTW that uses squared distances, where the new proof is both simpler and more general. On the positive side, we give approximation algorithms for problem variants where the center curve may have complexity at most $\ell$ under the discrete Fréchet distance. In particular, for fixed $k,\ell$ and $\varepsilon$, we give $(1+\varepsilon)$-approximation algorithms for the $(k,\ell)$-median and $(k,\ell)$-center objectives and a polynomial-time exact algorithm for the $(k,\ell)$-center objective.