Unpublished conference/Abstract (Scientific congresses and symposiums)
Combining Genotype with LD-based haplotype information as input for iterative pruning principal component analysis (ipPCA) to improve population clustering
Chaichoompu, Kridsadakorn; Fouladi, Ramouna; Wangkumhang, Pongsakorn et al.
2014, Capita Selecta in Complex Disease Analysis conference and EU COST Pancreas annual meeting (CSCDA 2014)
 

Files


Full Text
presentation_kridsadakorn_CSCDA2014.pdf
Author preprint (24.8 MB)
Details



Abstract :
[en] Single nucleotide polymorphisms (SNPs) are commonly used to capture variation between populations, and genome-wide SNP data are often pruned based on linkage disequilibrium (LD) patterns. Identifying and differentiating between subpopulations can become challenging, whether a rich set or a reduced set of genetic markers is used, especially when similar geographic regions are involved or when spurious patterns are likely to exist. Notably, haplotype composition and the pattern of LD between markers may vary between larger populations but may also play a role within more confined geographic regions. Indeed, the structure of haplotypes in unrelated individuals can reveal useful information about genetic ancestry. Here, we use iterative pruning principal component analysis (ipPCA) [1] to identify and characterize subpopulations in an unsupervised way. Furthermore, we propose to combine an LD-based haplotype encoding scheme with the ipPCA machinery to retrieve fine population substructures. Despite the complexities associated with haplotype inference, added value can be obtained when the LD structure between SNPs is exploited in the search for relevant population strata. The input data are either pruned genome-wide SNP data or multilocus haplotype information derived from the genome-wide SNP panel. Preliminary results indicate that ipPCA applied to pruned SNP data and ipPCA that explicitly uses multilocus information (haplotypes) give complementary information about population substructure for geographically confined populations. In fact, the two approaches address different aspects of population structure. [1] Intarapanich, A. et al. (2009), BMC Bioinformatics, 10: 382.
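To make the "iterative pruning" idea in the abstract concrete, the following is a minimal sketch and not the authors' ipPCA implementation. It assumes an individuals-by-markers matrix (the columns could hold pruned SNP genotypes or an encoded haplotype representation), projects individuals onto the first principal component, splits them into two groups, and recurses on each group. The stopping rule used here (largest gap in the PC1 scores relative to their range) is a simple stand-in chosen for illustration; the function names and the parameters min_size and min_gap_ratio are hypothetical.

```python
import numpy as np

def pc1_scores(X):
    """Scores of each row (individual) on the first principal component."""
    Xc = X - X.mean(axis=0)  # center each marker column
    # Rows of Vt are the principal axes of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[0]

def iterative_pruning(X, ids=None, min_size=2, min_gap_ratio=0.5):
    """Recursively split individuals along PC1 (illustrative stand-in rule).

    Returns a list of clusters, each a list of row indices into X.
    """
    if ids is None:
        ids = np.arange(X.shape[0])
    if len(ids) < 2 * min_size:
        return [ids.tolist()]

    scores = pc1_scores(X[ids])
    order = np.argsort(scores)
    sorted_scores = scores[order]
    span = sorted_scores[-1] - sorted_scores[0]
    gaps = np.diff(sorted_scores)
    cut = int(np.argmax(gaps))  # split at the widest gap along PC1

    # Stand-in stopping rule (an assumption, not a statistical test):
    # stop if there is no spread, or no clear gap, along PC1.
    if span <= 1e-12 or gaps[cut] / span < min_gap_ratio:
        return [ids.tolist()]

    left, right = ids[order[:cut + 1]], ids[order[cut + 1:]]
    if len(left) < min_size or len(right) < min_size:
        return [ids.tolist()]
    return (iterative_pruning(X, left, min_size, min_gap_ratio)
            + iterative_pruning(X, right, min_size, min_gap_ratio))

# Toy data: three constructed groups of four individuals, ten markers coded 0/2.
group_a = [2] * 8 + [0] * 2
group_b = [0] * 8 + [2] * 2
group_c = [0] * 10
X = np.array([group_a] * 4 + [group_b] * 4 + [group_c] * 4, dtype=float)

clusters = iterative_pruning(X)
print(sorted(sorted(c) for c in clusters))
```

On this noise-free toy matrix the first split separates the most divergent group from the other two, and the second split separates the remaining pair, which mirrors how coarse structure is resolved before finer substructure. Real data would need a proper significance-based stopping criterion rather than the gap heuristic assumed here.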
Disciplines :
Life sciences: Multidisciplinary, general & others
Author, co-author :
Chaichoompu, Kridsadakorn; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Bioinformatique
Fouladi, Ramouna; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Bioinformatique
Wangkumhang, Pongsakorn; National Center for Genetic Engineering and Biotechnology, Thailand > Genome Institute > Biostatistics and informatics Laboratory
Wilantho, Alisa; National Center for Genetic Engineering and Biotechnology, Thailand > Genome Institute > Biostatistics and informatics Laboratory
Chareanchim, Wanwisa; National Center for Genetic Engineering and Biotechnology, Thailand > Genome Institute > Biostatistics and informatics Laboratory
Shaw, Philip James; National Center for Genetic Engineering and Biotechnology, Thailand > Medical Molecular Biology Research Unit > Protein-Ligand Engineering and Molecular Biology Laboratory
Tongsima, Sissades; National Center for Genetic Engineering and Biotechnology, Thailand > Genome Institute > Biostatistics and informatics Laboratory
Sakuntabhai, Anavaj; Institut Pasteur, France > Functional Genetics of Infectious Diseases Unit
Van Steen, Kristel; Université de Liège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Bioinformatique
Language :
English
Title :
Combining Genotype with LD-based haplotype information as input for iterative pruning principal component analysis (ipPCA) to improve population clustering
Publication date :
26 November 2014
Event name :
Capita Selecta in Complex Disease Analysis conference and EU COST Pancreas annual meeting (CSCDA 2014)
Event place :
Liège, Belgium
Event date :
24-26 November 2014
Audience :
International
Name of the research project :
Foresting in Integromics Inference
Funders :
F.R.S.-FNRS - Fonds de la Recherche Scientifique [BE]
Available on ORBi :
since 20 July 2016
