Reference : Raw genotypes vs haplotype blocks for genome wide association studies by random forests
Scientific congresses and symposiums : Paper published in a book
Engineering, computing & technology : Multidisciplinary, general & others
http://hdl.handle.net/2268/28239
English
Botta, Vincent [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Hansoul, Sarah [Université de Liège - ULg > Département de productions animales > GIGA-R : Génomique animale]
Geurts, Pierre [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Wehenkel, Louis [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Sep-2008
Proc. of MLSB 2008, second workshop on Machine Learning in Systems Biology
No
International
[en] bioinformatics ; machine learning ; genetics
We consider two representations of the input data for genome-wide association studies using random forests: raw genotypes, described by a few thousand to a few hundred thousand discrete variables, each describing a single nucleotide polymorphism; and haplotype block contents, represented by combinations of about 10 to 100 adjacent, correlated genotypes. We adapt random forests to exploit haplotype blocks and compare this approach with the use of raw genotypes, in terms of predictive power and localization of causal mutations, on simulated datasets with one or two interacting effects.
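
The two input representations named in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the function names `simulate_genotypes` and `to_block_features`, the 0/1/2 genotype coding, and the fixed-size windows are all assumptions made for illustration. In particular, real haplotype blocks are delimited by correlation between adjacent markers rather than by a fixed window, and the genotypes simulated here are independent, so the sketch shows only the shape of the data, not its statistics.

```python
import random


def simulate_genotypes(n_individuals, n_snps, seed=0):
    """Return one row per individual, one genotype per SNP.

    Assumption: each genotype is coded 0, 1, or 2 and drawn independently
    and uniformly, which real genotype data would not be.
    """
    rng = random.Random(seed)
    return [[rng.choice((0, 1, 2)) for _ in range(n_snps)]
            for _ in range(n_individuals)]


def to_block_features(genotypes, block_size):
    """Group adjacent SNPs into fixed-size windows (hypothetical stand-in
    for haplotype blocks). Each block becomes one categorical feature whose
    value is the tuple of genotypes it contains.
    """
    features = []
    for row in genotypes:
        blocks = [tuple(row[i:i + block_size])
                  for i in range(0, len(row), block_size)]
        features.append(blocks)
    return features


# Raw representation: 5 individuals x 12 single-SNP variables.
raw = simulate_genotypes(n_individuals=5, n_snps=12)

# Block representation: the same individuals described by 3 blocks of 4 SNPs.
blocks = to_block_features(raw, block_size=4)
```

In the raw representation a learner sees each SNP as its own variable, whereas in the block representation it sees one variable per group of adjacent SNPs, which is the contrast the abstract compares.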