Fundamenta Informaticae

Species-driven persistent phylogeny

The perfect phylogeny is a widely used model in phylogenetics, since it provides an effective representation of the evolution of binary characters in several contexts, for example in haplotype inference. The model, which is conceptually the simplest among those actually used, is based on the infinite sites assumption: no character can mutate more than once in the whole tree. Since a large number of biological phenomena cannot be modeled by the perfect phylogeny, it becomes important to find generalizations that retain the computational tractability of the original model but are more flexible in modeling biological data when the infinite sites assumption is violated, e.g. because of back mutations. In this paper, we introduce a new model, called species-driven persistent phylogeny, and we study the relations between four different formulations: perfect phylogeny, persistent phylogeny, galled trees, and species-driven persistent phylogeny. The species-driven persistent phylogeny model is intermediate between the perfect and the persistent phylogeny, where a perfect phylogeny allows no back mutations and a persistent phylogeny allows each character to back mutate at most once. We describe an algorithm to compute a species-driven persistent phylogeny and we prove that every matrix admitting a galled tree also admits a species-driven persistent phylogeny.
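To make the infinite sites assumption concrete, the following minimal Python sketch checks whether a binary species-by-character matrix admits a perfect phylogeny. It is not the algorithm of this paper: it uses the classical pairwise-column test and assumes the directed setting, in which the root has state 0 for every character. The function name `admits_perfect_phylogeny` and the two example matrices are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def admits_perfect_phylogeny(matrix):
    """Pairwise-column test for a directed perfect phylogeny.

    Assumption: rows are species, columns are binary characters, and the
    root carries state 0 for every character. Under that assumption a
    perfect phylogeny exists exactly when no pair of columns shows all
    three of the row patterns (0, 1), (1, 0) and (1, 1).
    """
    if not matrix:
        return True
    n_chars = len(matrix[0])
    forbidden = {(0, 1), (1, 0), (1, 1)}
    for c1, c2 in combinations(range(n_chars), 2):
        # Collect the state pairs that actually occur in these two columns.
        observed = {(row[c1], row[c2]) for row in matrix}
        if forbidden <= observed:
            return False  # the two characters are in conflict
    return True


# Characters nest or are disjoint: compatible with a single gain each.
tree_like = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]

# Two characters that conflict: no tree with one gain per character and
# no losses can explain all three species.
conflicting = [
    [1, 0],
    [0, 1],
    [1, 1],
]

print(admits_perfect_phylogeny(tree_like))
print(admits_perfect_phylogeny(conflicting))
```

A matrix such as `conflicting` is the kind of input for which models that allow back mutations become relevant.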