Poster (Scientific congresses and symposiums)
Encoded haplotype data as input to ipPCA can better resolve population clustering
Chaichoompu, Kridsadakorn; Pongsakorn, Wangkumhang; Anunchai, Assawamakin et al.
2012, the 11th International Conference on Bioinformatics (InCoB2012)
 

Files


Full Text
poster_hapscan_ippca_mac.pdf
Author preprint (1.23 MB)

Details



Keywords :
population clustering; iterative pruning principal component analysis; haplotype encoding; sliding window
Abstract :
[en] Background: Studies in population genetics are mainly based on the analysis of genetic variation among populations. With the advent of advanced genotyping technology, large numbers of single nucleotide polymorphisms (SNPs) can be used to capture the underlying population variation. Iterative pruning principal component analysis (ipPCA) is a powerful tool for clustering subpopulations based on their SNP profiles. However, when several similar populations are analysed together, differentiating them can become very challenging. Haplotypes are known to capture more segregation information, and to offer higher power, than individual SNPs, but because haplotype inference is computationally complex they have not been widely used. Recently, haplotype sharing (HS) was reported as a good alternative method for evaluating variation among populations. HS interrogates the entire genotype data without estimating haplotype blocks, which makes it computationally efficient while retaining the population profile. Adopting the HS technique and introducing a new haplotype encoding as the input to ipPCA can yield very good population clustering outcomes.

Results: In this study we transformed indigenous Thai SNP genotyping data, obtained from the Pan-Asian SNP Consortium, into encoded haplotype profiles. The dataset includes 13 indigenous populations (245 individuals), with approximately 54K SNPs per individual. An encoded haplotype matrix was constructed by inferring overlapping haplotypes with a sliding-window approach in BEAGLE, an efficient haplotype inference tool. We fed this encoded haplotype matrix to ipPCA to cluster the individuals into subgroups using only their genetic profiles. We compared the results of the standard ipPCA protocol with those obtained from the encoded haplotype matrix, in terms of the number of subpopulations identified and the accuracy of assigning each individual to the correct subpopulation. With the encoded haplotype matrix as input, ipPCA recovered exactly 13 subpopulations with 99.18% individual assignment accuracy, whereas conventional ipPCA identified only 10 subpopulations with 93.47% individual assignment accuracy.

Conclusions: Our results demonstrate the great potential of using the encoded haplotype matrix with ipPCA for population genetics studies. This new protocol can improve the clustering of individuals using only their genetic profiles.
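The abstract does not specify how the haplotype encoding is defined, so the following is only a minimal sketch of one plausible sliding-window encoding, not the authors' method. It assumes each individual already has two phased haplotypes (as a phasing tool such as BEAGLE could provide), written here as strings of 0/1 alleles. The function name `encode_haplotypes`, the parameters `window` and `step`, and the toy data are all hypothetical. Each distinct haplotype seen in a window becomes one column, and each entry counts how many copies (0, 1 or 2) of that window haplotype an individual carries, giving a numeric matrix that a PCA-based method could take as input.

```python
from typing import Dict, List, Tuple


def encode_haplotypes(
    phased: Dict[str, Tuple[str, str]], window: int = 3, step: int = 1
) -> Tuple[List[str], List[Tuple[int, str]], List[List[int]]]:
    """Encode phased haplotypes as per-window haplotype copy counts.

    phased maps an individual id to its two phased haplotypes, given as
    equal-length allele strings (an assumed input format).
    Returns (ids, columns, matrix), where columns[j] is a
    (window start, window haplotype) pair and matrix[i][j] is the number of
    copies of that window haplotype carried by individual ids[i].
    """
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive")
    ids = sorted(phased)
    n_sites = len(phased[ids[0]][0])
    columns: List[Tuple[int, str]] = []
    col_index: Dict[Tuple[int, str], int] = {}
    counts: Dict[str, Dict[int, int]] = {ind: {} for ind in ids}

    # Slide an overlapping window along the SNP positions.
    for start in range(0, n_sites - window + 1, step):
        for ind in ids:
            for hap in phased[ind]:
                key = (start, hap[start:start + window])
                if key not in col_index:
                    # First time this haplotype is seen in this window.
                    col_index[key] = len(columns)
                    columns.append(key)
                j = col_index[key]
                counts[ind][j] = counts[ind].get(j, 0) + 1

    matrix = [
        [counts[ind].get(j, 0) for j in range(len(columns))] for ind in ids
    ]
    return ids, columns, matrix


# Toy data (hypothetical): three individuals, four SNPs, window of two SNPs.
toy = {
    "ind1": ("0101", "0110"),
    "ind2": ("0101", "0101"),
    "ind3": ("1100", "0110"),
}
ids, columns, matrix = encode_haplotypes(toy, window=2, step=1)
for ind, row in zip(ids, matrix):
    # Each row sums to 2 x (number of windows), here 2 x 3 = 6.
    print(ind, row)
```

In this sketch a smaller `step` gives more overlap between neighbouring windows and therefore more columns. The window width and step used for the poster are not stated in the abstract.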
Disciplines :
Life sciences: Multidisciplinary, general & others
Author, co-author :
Chaichoompu, Kridsadakorn ;  Université de Liège - ULiège > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Bioinformatique
Pongsakorn, Wangkumhang;  Biostatistics and informatics Laboratory, Genome Institute, National Center for Genetic Engineering and Biotechnology, Thailand
Anunchai, Assawamakin;  Biostatistics and informatics Laboratory, Genome Institute, National Center for Genetic Engineering and Biotechnology, Thailand
Sissades, Tongsima;  Biostatistics and informatics Laboratory, Genome Institute, National Center for Genetic Engineering and Biotechnology, Thailand
Language :
English
Title :
Encoded haplotype data as input to ipPCA can better resolve population clustering
Publication date :
October 2012
Event name :
the 11th International Conference on Bioinformatics (InCoB2012)
Event place :
Bangkok, Thailand
Event date :
03-10-2012 to 05-10-2012
Audience :
International
Available on ORBi :
since 31 January 2014
