TreeFix: Statistically Informed Gene Tree Error Correction Using Species Trees
Abstract
Accurate gene tree reconstruction is a fundamental problem in phylogenetics, with many important applications. However, sequence data alone often lack enough information to confidently support one gene tree topology over many competing alternatives. Here, we present a novel framework for combining sequence data and species tree information, and we describe an implementation of this framework in TreeFix, a new phylogenetic program for improving gene tree reconstructions. Given a gene tree (preferably computed using a maximum-likelihood phylogenetic program), TreeFix finds a "statistically equivalent" gene tree that minimizes a species tree-based cost function. We have applied TreeFix to two clades, comprising 12 Drosophila and 16 fungal genomes, as well as to simulated phylogenies, and show that it dramatically improves reconstructions compared with current state-of-the-art programs. Given its accuracy, speed, and simplicity, TreeFix should be applicable to a wide range of analyses and have many important implications for future investigations of gene evolution. The source code and a sample data set are available at http://compbio.mit.edu/treefix.
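
To make the idea of minimizing a species tree-based cost function concrete, here is a minimal Python sketch. It is not TreeFix's code. The abstract does not say which cost is used, so the sketch assumes one common choice: the number of gene duplications inferred by mapping each gene tree node to the lowest common ancestor (LCA) of its species in the species tree. It also assumes that the candidate gene trees have already been judged statistically equivalent on the sequence data, a step that is not shown. The nested-tuple tree encoding, the `SPECIES_gene` leaf naming, and all function names are hypothetical and chosen only for illustration.

```python
def clades(tree):
    """List the leaf set (frozenset) of every node in a nested-tuple tree.

    The last entry is always the leaf set of the node passed in.
    """
    if isinstance(tree, str):
        return [frozenset([tree])]
    result = []
    leaves = frozenset()
    for child in tree:
        sub = clades(child)
        result.extend(sub)
        leaves |= sub[-1]  # last entry is the child's own clade
    result.append(leaves)
    return result


def lca(species_clades, species_set):
    """Smallest species-tree clade that contains every species in species_set."""
    return min((c for c in species_clades if species_set <= c), key=len)


def species_of(gene_name):
    """Assumed naming convention: 'A_1' is a gene from species 'A'."""
    return gene_name.split("_")[0]


def reconcile(gene_tree, species_clades):
    """Return (species clade the node maps to, duplications in its subtree)."""
    if isinstance(gene_tree, str):
        return frozenset([species_of(gene_tree)]), 0
    child_maps = []
    dups = 0
    union = frozenset()
    for child in gene_tree:
        mapped, child_dups = reconcile(child, species_clades)
        child_maps.append(mapped)
        dups += child_dups
        union |= mapped
    node_map = lca(species_clades, union)
    # A node is a duplication if it maps to the same species node as a child.
    if any(m == node_map for m in child_maps):
        dups += 1
    return node_map, dups


def duplication_cost(gene_tree, species_tree):
    """Illustrative species tree-based cost: inferred duplication count."""
    return reconcile(gene_tree, clades(species_tree))[1]


def pick_lowest_cost(candidates, species_tree):
    """Among candidate gene trees (assumed statistically equivalent),
    return the one with the lowest duplication cost."""
    return min(candidates, key=lambda g: duplication_cost(g, species_tree))


species_tree = (("A", "B"), "C")
congruent = (("A_1", "B_1"), "C_1")
incongruent = (("A_1", "C_1"), "B_1")
best = pick_lowest_cost([incongruent, congruent], species_tree)
```

In this toy case the gene tree that agrees with the species tree needs no duplications, while the conflicting topology needs one, so the congruent tree is selected. A real analysis would also have to search the space of alternative topologies and test each candidate against the sequence data; both are outside the scope of this sketch.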