Defining binary phylogenetic trees using parsimony
Phylogenetic (i.e., leaf-labeled) trees play a fundamental role in evolutionary research. A typical problem is to reconstruct such trees from data like DNA alignments (whose columns are often referred to as characters), and a simple optimization criterion for such reconstructions is maximum parsimony. It is generally assumed that this criterion works well for data in which state changes are rare. In the present manuscript, we prove that each phylogenetic tree $T$ with $n\geq 20 k$ leaves is uniquely defined by the set $A_k(T)$, which consists of all characters with parsimony score $k$ on $T$. This can be considered a promising first step towards showing that maximum parsimony as a tree reconstruction criterion is justified when the number of changes in the data is relatively small.
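To make the objects in this statement concrete, the following minimal sketch computes the parsimony score of a character with Fitch's algorithm and enumerates $A_k(T)$ by brute force on a toy tree. It rests on several assumptions that are not taken from the text: trees are binary and encoded as nested 2-tuples, rooted on an arbitrary edge (the score does not depend on where the root is placed), characters are restricted to two states, and the names `fitch_score`, `leaves`, and `characters_with_score` are hypothetical. The four-leaf example lies far below the bound $n\geq 20k$, so it illustrates the definitions only, not the theorem.

```python
from itertools import product


def fitch_score(tree, character):
    """Parsimony score of `character` (dict: leaf -> state) on a rooted binary tree.

    Assumption: a tree is either a leaf label or a 2-tuple of subtrees.
    """
    changes = 0

    def state_set(node):
        nonlocal changes
        if not isinstance(node, tuple):
            return {character[node]}      # a leaf carries its observed state
        left, right = (state_set(child) for child in node)
        common = left & right
        if common:
            return common                 # children can agree: no change needed
        changes += 1                      # children disagree: one state change
        return left | right

    state_set(tree)
    return changes


def leaves(tree):
    """List the leaf labels of a nested-tuple tree."""
    if not isinstance(tree, tuple):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def characters_with_score(tree, k, states=("a", "b")):
    """Brute-force A_k(T): all characters over `states` with parsimony score k.

    Each character is returned as a tuple of states in sorted-leaf order,
    so that sets computed for different trees on the same leaves are comparable.
    """
    labels = sorted(leaves(tree))
    result = set()
    for assignment in product(states, repeat=len(labels)):
        character = dict(zip(labels, assignment))
        if fitch_score(tree, character) == k:
            result.add(assignment)
    return result


# Two different trees on the leaf set {1, 2, 3, 4}.
T1 = ((1, 2), (3, 4))
T2 = ((1, 3), (2, 4))
A1_T1 = characters_with_score(T1, 1)
A1_T2 = characters_with_score(T2, 1)
print(A1_T1 != A1_T2)
```

In this toy case the two sets differ because the character that separates leaves 1, 2 from leaves 3, 4 needs one change on `T1` but two on `T2`. The enumeration visits all $2^n$ binary characters, so it is only practical for very small $n$.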