Impact of Missing Data on Phylogenies Inferred from Empirical Phylogenomic Data Sets
; Baurain, Denis ;
in Molecular Biology and Evolution (2013), 30(1), 197-214

Progress in sequencing technology allows researchers to assemble ever-larger supermatrices for phylogenomic inference. However, current phylogenomic studies often rest on patchy data sets, with some having 80% missing (or ambiguous) data or more. Though early simulations had suggested that missing data per se do not harm phylogenetic inference when using sufficiently large data sets, Lemmon et al. (Lemmon AR, Brown JM, Stanger-Hall K, Lemmon EM. 2009. The effect of ambiguous data on phylogenetic estimates obtained by maximum likelihood and Bayesian inference. Syst Biol. 58:130-145.) have recently cast doubt on this consensus in a study based on the introduction of parsimony-uninformative incomplete characters. In this work, we empirically reassess the issue of missing data in phylogenomics while exploring possible interactions with the model of sequence evolution. First, we note that parsimony-uninformative incomplete characters are actually informative in a probabilistic framework. A reanalysis of Lemmon's data set with this in mind gives a very different interpretation of their results and shows that some of their conclusions may be unfounded. Second, we investigate the effect of the progressive introduction of missing data in a complete supermatrix (126 genes × 39 species) capable of resolving animal relationships. These analyses demonstrate that missing data perturb phylogenetic inference slightly beyond the expected decrease in resolving power. In particular, they exacerbate systematic errors by reducing the number of species effectively available for the detection of multiple substitutions.
Consequently, large sparse supermatrices are more sensitive to phylogenetic artifacts than smaller but less incomplete data sets, which argue for experimental designs aimed at collecting a modest number (∼50) of highly covered genes. Our results further confirm that including incomplete yet short-branch taxa (i.e., slowly evolving species or close outgroups) can help to eschew artifacts, as predicted by simulations. Finally, it appears that selecting an adequate model of sequence evolution (e.g., the site-heterogeneous CAT model instead of the site-homogeneous WAG model) is more beneficial to phylogenetic accuracy than reducing the level of missing data.

The African coelacanth genome provides insights into tetrapod evolution.
; ; et al
in Nature (2013), 496(7445), 311-6

The discovery of a living coelacanth specimen in 1938 was remarkable, as this lineage of lobe-finned fish was thought to have become extinct 70 million years ago. The modern coelacanth looks remarkably similar to many of its ancient relatives, and its evolutionary proximity to our own fish ancestors provides a glimpse of the fish that first walked on land. Here we report the genome sequence of the African coelacanth, Latimeria chalumnae. Through a phylogenomic analysis, we conclude that the lungfish, and not the coelacanth, is the closest living relative of tetrapods. Coelacanth protein-coding genes are significantly more slowly evolving than those of tetrapods, unlike other genomic features. Analyses of changes in genes and regulatory elements during the vertebrate adaptation to land highlight genes involved in immunity, nitrogen excretion and the development of fins, tail, ear, eye, brain and olfaction.
Functional assays of enhancers involved in the fin-to-limb transition and in the emergence of extra-embryonic tissues show the importance of the coelacanth genome as a blueprint for understanding tetrapod evolution.

Resolving difficult phylogenetic questions: why more sequences are not enough.
; ; et al
in PLoS Biology (2011), 9(3), 1000602

A phylogenomic falsification of the chromalveolate hypothesis
Baurain, Denis ; ; et al
Poster (2010, July)

Phylogenomic evidence for separate acquisition of plastids in cryptophytes, haptophytes, and stramenopiles
Baurain, Denis ; ; et al
in Molecular Biology and Evolution (2010), 27(7), 1698-709

According to the chromalveolate hypothesis (Cavalier-Smith T. 1999. Principles of protein and lipid targeting in secondary symbiogenesis: euglenoid, dinoflagellate, and sporozoan plastid origins and the eukaryote family tree. J Eukaryot Microbiol 46:347-366), the four eukaryotic groups with chlorophyll c-containing plastids originate from a single photosynthetic ancestor, which acquired its plastids by secondary endosymbiosis with a red alga. So far, molecular phylogenies have failed to either support or disprove this view. Here, we devise a phylogenomic falsification of the chromalveolate hypothesis that estimates signal strength across the three genomic compartments: If the four chlorophyll c-containing lineages indeed derive from a single photosynthetic ancestor, then similar amounts of plastid, mitochondrial, and nuclear sequences should allow to recover their monophyly.
Our results refute this prediction, with statistical support levels too different to be explained by evolutionary rate variation, phylogenetic artifacts, or endosymbiotic gene transfer. Therefore, we reject the chromalveolate hypothesis as falsified in favor of more complex evolutionary scenarios involving multiple higher order eukaryote-eukaryote endosymbioses.

Current approaches to phylogenomic reconstruction
Baurain, Denis ;
in Caetano-Anollés, Gustavo (Ed.) Evolutionary Genomics and Systems Biology (2010)

Lack of resolution in the animal phylogeny: Closely spaced cladogeneses or undetected systematic errors?
Baurain, Denis ; ;
in Molecular Biology and Evolution (2007), 24(1), 6-9

A recent phylogenomic study reported that the animal phylogeny was unresolved despite the use of 50 genes. This lack of resolution was interpreted as "a positive signature of closely spaced cladogenetic events." Here, we propose that this lack of resolution is rather due to the mutual cancellation of the phylogenetic signal (historical) and the nonphylogenetic signal (due to systematic errors) that results from inadequate taxon sampling and/or model of sequence evolution. Starting with a data set of comparable size, we use 3 different strategies to reduce the nonphylogenetic signal: 1) increasing the number of species; 2) replacing a fast-evolving species by a slowly evolving one; and 3) using a better model of sequence evolution. In all cases, the phylogenetic resolution is markedly improved, in agreement with our hypothesis that the originally reported lack of resolution was artifactual.

Phylogenomics: how far back in the past can we go?
; Baurain, Denis ;
in Pudritz, Ralph; Higgs, Paul; Stone, Jonathan (Eds.) Planetary Systems and the Origins of Life (2007)

Phylogénomique des lignées photosynthétiques
Baurain, Denis ; ;
Scientific conference (2006, December 22)

Animal evolution — A fully-resolved phylogenomic tree argues against the Cambrian explosion hypothesis
; ; Baurain, Denis
Poster (2006, March)

The animal phylogeny and the fundamental importance of taxon sampling
; ; Baurain, Denis
Scientific conference (2006, February 20)

Vertebrate origins: does the tunic make the man?
; Baurain, Denis ;
in Medecine Sciences : M/S (2006), 22(8-9, AUG-SEP), 688-690