Source author record

Anne Driemel

Anne Driemel appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

ResearcherUnclaimed source record

Computational Geometry Data Structures and Algorithms

Catalog footprint

What is connected

17works

2topics

4close collaborators

Actions

Connect this record

Open graph Browse works

Inspect adjacent papers, topics, institutions and collaborators without losing the researcher page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

preprint2024arXiv

Range Reporting for Time Series via Rectangle Stabbing

We study the Fréchet queries problem. It is a data structure problem, where we are given a set $S$ of $n$ polygonal curves and a distance threshold $ρ$. The data structure should support queries with a polygonal curve $q$ for the elements of $S$, for which the continuous Fréchet distance to $q$ is at most $ρ$. Afshani and Driemel in 2018 studied this problem for two-dimensional polygonal curves and gave upper and lower bounds on the space-query time tradeoff. We study the case that the ambient space of the curves is one-dimensional and show an intimate connection to the well-studied rectangle stabbing problem. Here, we are given a set of hyperrectangles as input and a query with a point $q$ should return all input rectangles that contain this point. Using known data structures for rectangle stabbing or orthogonal range searching this directly leads to a data structure with $\mathcal{O}(n \log ^{t-1} n)$ storage and $\mathcal{O}(\log^{t-1} n+k)$ query time, where $k$ denotes the output size and $t$ can be chosen as the maximum number of vertices of either (a) the stored curves or (b) the query curves. The resulting bounds improve upon the bounds by Afshani and Driemel in both the storage and query time. In addition, we show that known lower bounds for rectangle stabbing and orthogonal range reporting with dimension parameter $d= \lfloor t/2 \rfloor$ can be applied to our problem via reduction. .

preprint2024arXiv

Revisiting the Fréchet distance between piecewise smooth curves

Since its introduction to computational geometry by Alt and Godau in 1992, the Fréchet distance has been a mainstay of algorithmic research on curve similarity computations. The focus of the research has been on comparing polygonal curves, with the notable exception of an algorithm for the decision problem for planar piecewise smooth curves due to Rote (2007). We present an algorithm for the decision problem for piecewise smooth curves that is both conceptually simpler and naturally extends to the first algorithm for the problem for piecewise smooth curves in $\mathbb{R}^d$. We assume that the algorithm is given two continuous curves, each consisting of a sequence of $m$, resp.\ $n$, smooth pieces, where each piece belongs to a sufficiently well-behaved class of curves, such as the set of algebraic curves of bounded degree. We introduce a decomposition of the free space diagram into a controlled number of pieces that can be used to solve the decision problem similarly to the polygonal case, in $O(mn)$ time, leading to a computation of the Fréchet distance that runs in $O(mn\log(mn))$ time. Furthermore, we study approximation algorithms for piecewise smooth curves that are also $c$-packed for some fixed value $c$. We adapt the existing framework for $(1+ε)$-approximations and show that an approximate decision can be computed in $O(cn/ε)$ time for any $ε> 0$.

preprint2022arXiv

Analysis of a Greedy Heuristic for the Labeling of a Map with a Time-Window Interface

In this paper, we analyze the approximation quality of a greedy heuristic for automatic map labeling. As input, we have a set of events, each associated with a label at a fixed position, a timestamp, and a weight. Let a time-window labeling be a selection of these labels such that all corresponding timestamps lie in a queried time window and no two labels overlap. A solution to the time-window labeling problem consists of a data structure that encodes a time-window labeling for each possible time window; when a user specifies a time window of interest using a slider interface, we query the data structure for the corresponding labeling. We define the quality of a time-window labeling solution as the sum of the weights of the labels in each time-window labeling, integrated over all time windows. We aim at maximizing the quality under the condition that a label may never disappear when the user shrinks the time window. In this paper, we analyze how well a greedy heuristic approximates the maximum quality that can be realized under this condition. On the one hand, we present an instance with square labels of equal size and equal weight for which the greedy heuristic fails to find a solution of at least 1/4 of the quality of an optimal solution. On the other hand, we prove that the greedy heuristic does guarantee a solution with at least 1/8 of the quality of an optimal solution. In the case of disk-shaped labels of equal size and equal weight, the greedy heuristic gives a solution with at least 1/10 of the quality of an optimal solution. If the labels are squares or disks of equal size and the maximum weight divided by the minimum weight is at most b, then the greedy heuristic has approximation ratio Theta(log b).

preprint2022arXiv

Approximating Length-Restricted Means under Dynamic Time Warping

We study variants of the mean problem under the $p$-Dynamic Time Warping ($p$-DTW) distance, a popular and robust distance measure for sequential data. In our setting we are given a set of finite point sequences over an arbitrary metric space and we want to compute a mean point sequence of given length that minimizes the sum of $p$-DTW distances, each raised to the $q$\textsuperscript{th} power, between the input sequences and the mean sequence. In general, the problem is $\mathrm{NP}$-hard and known not to be fixed-parameter tractable in the number of sequences. On the positive side, we show that restricting the length of the mean sequence significantly reduces the hardness of the problem. We give an exact algorithm running in polynomial time for constant-length means. We explore various approximation algorithms that provide a trade-off between the approximation factor and the running time. Our approximation algorithms have a running time with only linear dependency on the number of input sequences. In addition, we use our mean algorithms to obtain clustering algorithms with theoretical guarantees.

preprint2022arXiv

Faster Approximate Covering of Subcurves under the Fréchet Distance

Subtrajectory clustering is an important variant of the trajectory clustering problem, where the start and endpoints of trajectory patterns within the collected trajectory data are not known in advance. We study this problem in the form of a set cover problem for a given polygonal curve: find the smallest number $k$ of representative curves such that any point on the input curve is contained in a subcurve that has Fréchet distance at most a given $Δ$ to a representative curve. We focus on the case where the representative curves are line segments and approach this NP-hard problem with classical techniques from the area of geometric set cover: we use a variant of the multiplicative weights update method which was first suggested by Brönniman and Goodrich for set cover instances with small VC-dimension. We obtain a bicriteria-approximation algorithm that computes a set of $O(k\log(k))$ line segments that cover a given polygonal curve of $n$ vertices under Fréchet distance at most $O(Δ)$. We show that the algorithm runs in $\widetilde{O}(k^2 n + k n^3)$ time in expectation and uses $ \widetilde{O}(k n + n^3)$ space. For two dimensional input curves that are $c$-packed, we bound the expected running time by $\widetilde{O}(k^2 c^2 n)$ and the space by $ \widetilde{O}(kn + c^2 n)$. In $\mathbb{R}^d$ the dependency on $n$ instead is quadratic. In addition, we present a variant of the algorithm that uses implicit weight updates on the candidate set and thereby achieves near-linear running time in $n$ without any assumptions on the input curve, while keeping the same approximation bounds. This comes at the expense of a small (polylogarithmic) dependency on the relative arclength.

preprint2022arXiv

Minimum-Error Triangulations for Sea Surface Reconstruction

We apply state-of-the-art computational geometry methods to the problem of reconstructing a time-varying sea surface from tide gauge records. Our work builds on a recent article by Nitzke et al.~(Computers \& Geosciences, 157:104920, 2021) who have suggested to learn a triangulation $D$ of a given set of tide gauge stations. The objective is to minimize the misfit of the piecewise linear surface induced by $D$ to a reference surface that has been acquired with satellite altimetry. The authors restricted their search to k-order Delaunay ($k$-OD) triangulations and used an integer linear program in order to solve the resulting optimization problem. In geometric terms, the input to our problem consists of two sets of points in $\mathbb{R}^2$ with elevations: a set $\mathcal{S}$ that is to be triangulated, and a set $\mathcal{R}$ of reference points. Intuitively, we define the error of a triangulation as the average vertical distance of a point in $\mathcal{R}$ to the triangulated surface that is obtained by interpolating elevations of $\mathcal{S}$ linearly in each triangle. Our goal is to find the triangulation of $\mathcal{S}$ that has minimum error with respect to $\mathcal{R}$. In our work, we prove that the minimum-error triangulation problem is NP-hard and cannot be approximated within any multiplicative factor in polynomial time unless $P=NP$. At the same time we show that the problem instances that occur in our application (considering sea level data from several hundreds of tide gauge stations worldwide) can be solved relatively fast using dynamic programming when restricted to $k$-OD triangulations for $k\le 7$. In particular, instances for which the number of connected components of the so-called $k$-OD fixed-edge graph is small can be solved within few seconds.

preprint2022arXiv

On Computing the $k$-Shortcut Fréchet Distance

The Fréchet distance is a popular measure of dissimilarity for polygonal curves. It is defined as a min-max formulation that considers all direction-preserving continuous bijections of the two curves. Because of its susceptibility to noise, Driemel and Har-Peled introduced the shortcut Fréchet distance in 2012, where one is allowed to take shortcuts along one of the curves, similar to the edit distance for sequences. We analyse the parameterized version of this problem, where the number of shortcuts is bounded by a parameter $k$. The corresponding decision problem can be stated as follows: Given two polygonal curves $T$ and $B$ of at most $n$ vertices, a parameter $k$ and a distance threshold $δ$, is it possible to introduce $k$ shortcuts along $B$ such that the Fréchet distance of the resulting curve and the curve $T$ is at most $δ$? We study this problem for polygonal curves in the plane. We provide a complexity analysis for this problem with the following results: (i) assuming the exponential-time-hypothesis (ETH), there exists no algorithm with running time bounded by $n^{o(k)}$; (ii) there exists a decision algorithm with running time in $O(kn^{2k+2}\log n)$. In contrast, we also show that efficient approximate decider algorithms are possible, even when $k$ is large. We present a $(3+\varepsilon)$-approximate decider algorithm with running time in $O(k n^2 \log^2 n)$ for fixed $\varepsilon$. In addition, we can show that, if $k$ is a constant and the two curves are $c$-packed for some constant $c$, then the approximate decider algorithm runs in near-linear time.

preprint2022arXiv

Pattern matching under DTW distance

In this work, we consider the problem of pattern matching under the dynamic time warping (DTW) distance motivated by potential applications in the analysis of biological data produced by the third generation sequencing. To measure the DTW distance between two strings, one must "warp" them, that is, double some letters in the strings to obtain two equal-lengths strings, and then sum the distances between the letters in the corresponding positions. When the distances between letters are integers, we show that for a pattern P with m runs and a text T with n runs: 1. There is an O(m + n)-time algorithm that computes all locations where the DTW distance from P to T is at most 1; 2. There is an O(kmn)-time algorithm that computes all locations where the DTW distance from P to T is at most k. As a corollary of the second result, we also derive an approximation algorithm for general metrics on the alphabet.

preprint2021arXiv

$(2+ε)$-ANN for time series under the Fréchet distance

We study approximate-near-neighbor data structures for time series under the continuous Fréchet distance. For an attainable approximation factor $c>1$ and a query radius $r$, an approximate-near-neighbor data structure can be used to preprocess $n$ curves in $\mathbb{R}$ (aka time series), each of complexity $m$, to answer queries with a curve of complexity $k$ by either returning a curve that lies within Fréchet distance $cr$, or answering that there exists no curve in the input within distance $r$. In both cases, the answer is correct. Our first data structure achieves a $(5+ε)$ approximation factor, uses space in $n\cdot \mathcal{O}\left({ε^{-1}}\right)^{k} + \mathcal{O}(nm)$ and has query time in $\mathcal{O}\left(k\right)$. Our second data structure achieves a $(2+ε)$ approximation factor, uses space in $n\cdot \mathcal{O}\left(\frac{m}{kε}\right)^{k} + \mathcal{O}(nm)$ and has query time in $\mathcal{O}\left(k\cdot 2^k\right)$. Our third positive result is a probabilistic data structure based on locality-sensitive hashing, which achieves space in $\mathcal{O}(n\log n+nm)$ and query time in $\mathcal{O}(k\log n)$, and which answers queries with an approximation factor in $\mathcal{O}(k)$. All of our data structures make use of the concept of signatures, which were originally introduced for the problem of clustering time series under the Fréchet distance. In addition, we show lower bounds for this problem. Consider any data structure which achieves an approximation factor less than $2$ and which supports curves of arclength up to $L$ and answers the query using only a constant number of probes. We show that under reasonable assumptions on the word size any such data structure needs space in $L^{Ω(k)}$.

preprint2020arXiv

On the hardness of computing an average curve

We study the complexity of clustering curves under $k$-median and $k$-center objectives in the metric space of the Fréchet distance and related distance measures. Building upon recent hardness results for the minimum-enclosing-ball problem under the Fréchet distance, we show that also the $1$-median problem is NP-hard. Furthermore, we show that the $1$-median problem is W[1]-hard with the number of curves as parameter. We show this under the discrete and continuous Fréchet and Dynamic Time Warping (DTW) distance. This yields an independent proof of an earlier result by Bulteau et al. from 2018 for a variant of DTW that uses squared distances, where the new proof is both simpler and more general. On the positive side, we give approximation algorithms for problem variants where the center curve may have complexity at most $\ell$ under the discrete Fréchet distance. In particular, for fixed $k,\ell$ and $\varepsilon$, we give $(1+\varepsilon)$-approximation algorithms for the $(k,\ell)$-median and $(k,\ell)$-center objectives and a polynomial-time exact algorithm for the $(k,\ell)$-center objective.

preprint2015arXiv

Clustering time series under the Fréchet distance

The Fréchet distance is a popular distance measure for curves. We study the problem of clustering time series under the Fréchet distance. In particular, we give $(1+\varepsilon)$-approximation algorithms for variations of the following problem with parameters $k$ and $\ell$. Given $n$ univariate time series $P$, each of complexity at most $m$, we find $k$ time series, not necessarily from $P$, which we call \emph{cluster centers} and which each have complexity at most $\ell$, such that (a) the maximum distance of an element of $P$ to its nearest cluster center or (b) the sum of these distances is minimized. Our algorithms have running time near-linear in the input size for constant $\varepsilon$, $k$ and $\ell$. To the best of our knowledge, our algorithms are the first clustering algorithms for the Fréchet distance which achieve an approximation factor of $(1+\varepsilon)$ or better. Keywords: time series, longitudinal data, functional data, clustering, Fréchet distance, dynamic time warping, approximation algorithms.

preprint2013arXiv

Computing the Fréchet distance with shortcuts is NP-hard

We study the shortcut Fréchet distance, a natural variant of the Fréchet distance, that allows us to take shortcuts from and to any point along one of the curves. The classic Fréchet distance is a bottle-neck distance measure and hence quite sensitive to outliers. The shortcut Fréchet distance allows us to cut across outliers and hence produces significantly more meaningful results when dealing with real world data. Driemel and Har-Peled recently described approximation algorithms for the restricted case where shortcuts have to start and end at input vertices. We show that, in the general case, the problem of computing the shortcut Fréchet distance is NP-hard. This is the first hardness result for a variant of the Fréchet distance between two polygonal curves in the plane. We also present two algorithms for the decision problem: a 3-approximation algorithm for the general case and an exact algorithm for the vertex-restricted case. Both algorithms run in O(n^3 log n) time.

preprint2013arXiv

Jaywalking your Dog - Computing the Fréchet Distance with Shortcuts

The similarity of two polygonal curves can be measured using the Fréchet distance. We introduce the notion of a more robust Fréchet distance, where one is allowed to shortcut between vertices of one of the curves. This is a natural approach for handling noise, in particular batched outliers. We compute a (3+\eps)-approximation to the minimum Fréchet distance over all possible such shortcuts, in near linear time, if the curve is c-packed and the number of shortcuts is either small or unbounded. To facilitate the new algorithm we develop several new tools: (A) A data structure for preprocessing a curve (not necessarily c-packed) that supports (1+\eps)-approximate Fréchet distance queries between a subcurve (of the original curve) and a line segment. (B) A near linear time algorithm that computes a permutation of the vertices of a curve, such that any prefix of 2k-1 vertices of this permutation, form an optimal approximation (up to a constant factor) to the original curve compared to any polygonal curve with k vertices, for any k > 0. (C) A data structure for preprocessing a curve that supports approximate Fréchet distance queries between a subcurve and query polygonal curve. The query time depends quadratically on the complexity of the query curve, and only (roughly) logarithmically on the complexity of the original curve. To our knowledge, these are the first data structures to support these kind of queries efficiently.

preprint2012arXiv

Approximating the Fréchet Distance for Realistic Curves in Near Linear Time

We present simple and practical $(1+\eps)$-approximation algorithm for the Frechet distance between curves. To analyze this algorithm we introduce a new realistic family of curves, $c$-packed curves, that is closed under simplification. We believe the notion of $c$-packed curves to be of independent interest. We show that our algorithm has near linear running time for $c$-packed curves, and show similar results for other input models.

preprint2012arXiv

Flow Computations on Imprecise Terrains

We study the computation of the flow of water on imprecise terrains. We consider two approaches to modeling flow on a terrain: one where water flows across the surface of a polyhedral terrain in the direction of steepest descent, and one where water only flows along the edges of a predefined graph, for example a grid or a triangulation. In both cases each vertex has an imprecise elevation, given by an interval of possible values, while its (x,y)-coordinates are fixed. For the first model, we show that the problem of deciding whether one vertex may be contained in the watershed of another is NP-hard. In contrast, for the second model we give a simple O(n log n) time algorithm to compute the minimal and the maximal watershed of a vertex, where n is the number of edges of the graph. On a grid model, we can compute the same in O(n) time.

preprint2011arXiv

Computing the Fréchet Distance Between Folded Polygons

Computing the Fréchet distance for surfaces is a surprisingly hard problem and the only known algorithm is limited to computing it between flat surfaces. We adapt this algorithm to create one for computing the Fréchet distance for a class of non-flat surfaces which we call folded polygons. Unfortunately, the original algorithm cannot be extended directly. We present three different methods to adapt it. The first of which is a fixed-parameter tractable algorithm. The second is a polynomial-time approximation algorithm. Finally, we present a restricted class of folded polygons for which we can compute the Fréchet distance in polynomial time.

preprint2011arXiv

On the Expected Complexity of Voronoi Diagrams on Terrains

We investigate the combinatorial complexity of geodesic Voronoi diagrams on polyhedral terrains using a probabilistic analysis. Aronov etal [ABT08] prove that, if one makes certain realistic input assumptions on the terrain, this complexity is Θ(n + m \sqrt n) in the worst case, where n denotes the number of triangles that define the terrain and m denotes the number of Voronoi sites. We prove that under a relaxed set of assumptions the Voronoi diagram has expected complexity O(n+m), given that the sites have a uniform distribution on the domain of the terrain(or the surface of the terrain). Furthermore, we present a worst-case construction of a terrain which implies a lower bound of Vmega(n m2/3) on the expected worst-case complexity if these assumptions on the terrain are dropped. As an additional result, we can show that the expected fatness of a cell in a random planar Voronoi diagram is bounded by a constant.

Anne Driemel

What is connected

Connect this record

See the researcher in context

Building this map preview

17 published item(s)

Range Reporting for Time Series via Rectangle Stabbing

Revisiting the Fréchet distance between piecewise smooth curves

Analysis of a Greedy Heuristic for the Labeling of a Map with a Time-Window Interface

Approximating Length-Restricted Means under Dynamic Time Warping

Faster Approximate Covering of Subcurves under the Fréchet Distance

Minimum-Error Triangulations for Sea Surface Reconstruction

On Computing the $k$-Shortcut Fréchet Distance

Pattern matching under DTW distance

$(2+ε)$-ANN for time series under the Fréchet distance

On the hardness of computing an average curve

Clustering time series under the Fréchet distance

Computing the Fréchet distance with shortcuts is NP-hard

Jaywalking your Dog - Computing the Fréchet Distance with Shortcuts

Approximating the Fréchet Distance for Realistic Curves in Near Linear Time

Flow Computations on Imprecise Terrains

Computing the Fréchet Distance Between Folded Polygons

On the Expected Complexity of Voronoi Diagrams on Terrains