Paper detail

Lower Bounds on Performance of Metric Tree Indexing Schemes for Exact Similarity Search in High Dimensions

Within a mathematically rigorous model, we analyse the curse of dimensionality for deterministic exact similarity search in the context of popular indexing schemes: metric trees. The datasets $X$ are sampled randomly from a domain $Ω$, equipped with a distance, $ρ$, and an underlying probability distribution, $μ$. While performing an asymptotic analysis, we send the intrinsic dimension $d$ of $Ω$ to infinity, and assume that the size of a dataset, $n$, grows superpolynomially yet subexponentially in $d$. Exact similarity search refers to finding the nearest neighbour in the dataset $X$ to a query point $ω\inΩ$, where the query points are subject to the same probability distribution $μ$ as datapoints. Let $\mathscr F$ denote a class of all 1-Lipschitz functions on $Ω$ that can be used as decision functions in constructing a hierarchical metric tree indexing scheme. Suppose the VC dimension of the class of all sets $\{ω\colon f(ω)\geq a\}$, $a\in\R$ is $o(n^{1/4}/\log^2n)$. (In view of a 1995 result of Goldberg and Jerrum, even a stronger complexity assumption $d^{O(1)}$ is reasonable.) We deduce the $Ω(n^{1/4})$ lower bound on the expected average case performance of hierarchical metric-tree based indexing schemes for exact similarity search in $(Ω,X)$. In paricular, this bound is superpolynomial in $d$.

preprint2012arXivOpen access

Signal facts

What is known right now

Open access1 author1 topic

Next steps

Decide what to do with this paper

Use like or dislike for the fast social read. The more specific scholarly feedback stays available below when needed.

Log in to curate

Reading frame

Keep the important context close to the paper

Keep the important signals around this paper in one place: votes, save state, collection context, reviews and the metadata you need before deciding what to do next.

Institutions

Add specific reaction

Move through the context

Research map

Open full explorer

Move through nearby people, institutions, topics and adjacent work without leaving the paper page.

Building this map preview

BZPEER is loading the nearby papers, people, topics and institutions for this page.

Structured reviews

0 review(s)

ContributeLeave structured feedbackUse the review template when you have a concrete strength, concern or method question.Open review form

No structured reviews yet. High-signal critique starts here.

Work discussion

0 comment(s)

DiscussAdd a high-signal commentKeep quick notes, caveats and replication pointers separate from formal reviews.Open comment form

No discussion yet. The first strong comment sets the tone.