Scientific conference in universities or research centers
Haplotype information combined with iterative pruning PCA (ipPCA) to improve population clustering
Chaichoompu, Kridsadakorn; Fouladi, Ramouna; Wangkumhang, Pongsakorn et al.
2014
 

Files


Full Text
presentation_kridsadakorn_emgm_04_2014.pdf
Author preprint (16.08 MB)
Presentation





Details



Keywords :
PCA; population clustering; Haplotype
Abstract :
[en] Single nucleotide polymorphisms (SNPs) are commonly used to capture variation between populations. Often, genome-wide SNP data are pruned based on linkage disequilibrium (LD) patterns, or small subsets of SNPs are selected (e.g. PCA-correlated SNPs) to reproduce the genomic structure of the complete data set. Identifying and differentiating between subpopulations using such a reduced set can become challenging, especially when similar geographic regions are involved or when spurious patterns are likely to exist. Although PCA-based methods can resolve structure, they cannot infer ancestry. On the other hand, the structure of haplotypes in unrelated individuals can reveal useful information about genetic ancestry. Notably, haplotype composition and the pattern of LD between markers may vary between larger populations, but may also play a role within more confined geographic regions. In addition, iterative pruning principal component analysis (ipPCA) has been shown to be a powerful tool for clustering subpopulations based on SNP profiles. Despite the complexities associated with haplotype inference, we argue that added value can be obtained when the LD structure between SNPs is exploited in the search for relevant population strata. In this work, we propose to combine a novel LD-based haplotype encoding scheme with the ipPCA machinery to retrieve fine population substructures. The approach is compared to state-of-the-art methods in the context of population substructure and admixture analysis.
Research center :
Systems and Modeling Unit, Montefiore Institute and Bioinformatics and Modeling, GIGA-R
Disciplines :
Life sciences: Multidisciplinary, general & others
Author, co-author :
Chaichoompu, Kridsadakorn;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Bioinformatique
Fouladi, Ramouna;  University of Liege > Montefiore Institute > Systems and Modeling Unit
Wangkumhang, Pongsakorn;  National Center for Genetic Engineering and Biotechnology > Genome Institute > Biostatistics and informatics Laboratory
Wilantho, Alisa;  National Center for Genetic Engineering and Biotechnology > Genome Institute > Biostatistics and informatics Laboratory
Chareanchim, Wanwisa;  National Center for Genetic Engineering and Biotechnology > Genome Institute > Systems and Modeling Unit
Tongsima, Sissades;  National Center for Genetic Engineering and Biotechnology > Genome Institute > Biostatistics and informatics Laboratory
Sakuntabhai, Anavaj;  Institut Pasteur > Functional Genetics of Infectious Diseases Unit
Van Steen, Kristel;  University of Liege > Montefiore Institute > Systems and Modeling Unit
Language :
English
Title :
Haplotype information combined with iterative pruning PCA (ipPCA) to improve population clustering
Publication date :
01 April 2014
Event name :
The 42nd European Mathematical Genetics Meeting 2014
Event organizer :
Statistical Genetics and Bioinformatics Group, Cologne Center for Genomics (CCG), University of Cologne
Event place :
Cologne, Germany
Event date :
from 01-04-2014 to 02-04-2014
Audience :
International
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique [BE]
Available on ORBi :
since 11 April 2014

