
The similarity metric

We study a new class of distances suited to measuring similarity relations between sequences, with one distance per type of similarity. We propose a new "normalized information distance", based on the noncomputable notion of Kolmogorov complexity, and show that it belongs to this class and minorizes every computable distance in the class (that is, it is universal in that it discovers all computable similarities). We demonstrate that it is a metric and call it the similarity metric. This theory forms the foundation for a new practical tool. To demonstrate generality and robustness, we give two distinct applications in widely divergent areas, using standard compression programs such as gzip and GenCompress. First, we compare whole mitochondrial genomes and infer their evolutionary history, which yields the first completely automatically computed whole mitochondrial phylogeny tree. Second, we fully automatically compute the language tree of 52 different languages.
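The idea of replacing Kolmogorov complexity with the output length of a real compressor can be illustrated with a minimal sketch. It uses the compression-based estimate commonly called the normalized compression distance, which is not necessarily the exact estimator used in the paper. The function names `compressed_size` and `ncd` are hypothetical, and Python's `zlib` (DEFLATE, the algorithm behind gzip) stands in for the compressors named in the abstract.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (a computable stand-in for K)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Compression-based approximation of the normalized information distance.

    Assumption: uses the common estimate
        (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed length and xy is the concatenation of x and y.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    base = b"the quick brown fox jumps over the lazy dog " * 50
    near = base.replace(b"fox", b"cat")  # small edit of the same text
    far = b"colorless green ideas sleep furiously " * 50  # unrelated text
    print("near:", ncd(base, near))
    print("far: ", ncd(base, far))
```

Smaller values indicate more shared structure. Because a real compressor only approximates Kolmogorov complexity, the result can fall slightly outside the interval [0, 1] and depends on the compressor chosen.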

Authors: Ming Li, Xin Chen, Xin Li, Bin Ma, Paul Vitanyi


preprint / 2004
