Abstract

Knowing how organisms are related evolutionarily is crucial for interpreting nearly all biological results. Evolutionary history is inferred using computational techniques that make simplifying assumptions about the evolutionary process. There is ample biological evidence that many of these assumptions are routinely violated, but little is known about the effects of assumption violations on phylogenetic inference. Here I show how site-specific changes in evolutionary rates, an important evolutionary feature not incorporated into phylogenetic models, can cause existing methods to produce incorrect results. I develop a mixed branch length technique that produces more reliable inferences under realistic conditions. I outline a strategy to reduce the computational demands of the mixed branch length model through code optimization and algorithmic improvements.

Biologists also want to assess the confidence they should have in inferred phylogenies. Bayesian methods calculate posterior probabilities for phylogenetic hypotheses (i.e., the probability that a hypothesis is correct given the data, model, and prior probability distributions over model parameters), producing an intuitively meaningful measure of statistical confidence. However, concerns that posterior probabilities may regularly be too high have hampered acceptance of phylogenies produced using Bayesian methods. Understanding if, when, and why posterior probabilities are inflated is a crucial problem.

Here I show that although posterior probabilities are by definition correct assessments of subjective confidence given prior assumptions, they are accurate statements of objective confidence only when branch lengths are known in advance. When branch lengths are unknown, posterior probabilities can be either higher or lower than the long-run chance a hypothesis is correct. Posterior probabilities reported on actual phylogenies should therefore be interpreted only from a subjectivist standpoint.

My results suggest that phylogenetic techniques can produce incorrect phylogenies and assessments of statistical confidence due to assumption violations. Incorporating knowledge of how evolution works at the biological level into phylogenetic models can improve the quality of evolutionary inferences. The mixed branch length model incorporates an important feature of molecular evolution, potentially generating more accurate phylogenies than existing techniques.

This dissertation includes both my previously published and my co-authored materials.

Details

Title: Deconstructing phylogenetic reconstruction: Effects of assumption violations on evolutionary inference
Author: Kolaczkowski, Bryan
Year: 2006
Publisher: ProQuest Dissertations Publishing
ISBN: 978-1-109-90278-5
Source type: Dissertation or Thesis
Language of publication: English
ProQuest document ID: 305274093
Copyright: Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.