References of "Gusareva, Elena"
Full Text
Peer Reviewed
A cautionary note on the impact of protocol changes for Genome-Wide Association SNP x SNP Interaction studies: an example on ankylosing spondylitis
Bessonov, Kyrylo ULg; Gusareva, Elena ULg; Van Steen, Kristel ULg

in Human Genetics (2015)

Genome-wide association interaction (GWAI) studies have increased in popularity. Yet to date, no standard protocol exists. In practice, any GWAI workflow involves making choices about the quality control strategy, SNP filtering, linkage disequilibrium (LD) pruning, and the analytic tool used to model or test for genetic interactions. Each of these can have an impact on the final epistasis findings and may affect their reproducibility in follow-up analyses. Choosing an analytic tool is not straightforward, as several such tools exist and current understanding of their performance is often based on very particular simulation settings. In the present study, we wish to raise awareness of the impact that (minor) changes in a GWAI analysis protocol can have on final epistasis findings. In particular, we investigate the influence of marker selection and marker prioritization strategies, LD pruning and the choice of epistasis detection analytics on study results, giving rise to 8 GWAI protocols. Discussions are made in the context of the ankylosing spondylitis (AS) data obtained via the Wellcome Trust Case Control Consortium (WTCCC2). As expected, the largest impact on AS epistasis findings is caused by the choice of marker selection criterion, followed by marker coding and LD pruning. In MB-MDR, co-dominant coding of main effects is more robust to the effects of LD pruning than additive coding. We were able to reproduce previously reported epistasis involvement of HLA-B and ERAP1 in AS pathology. In addition, our results suggest involvement of MAGI3 and PARK2, responsible for cell adhesion and cellular trafficking. Gene Ontology (GO) biological function enrichment analysis across the 8 considered GWAI protocols also suggested that AS could be associated with central nervous system (CNS) malfunctions, specifically in nerve impulse propagation and in neurotransmitter metabolic processes.

Full Text
Peer Reviewed
High-density mapping of the MHC identifies a shared role for HLA-DRB1*01:03 in inflammatory bowel diseases and heterozygous advantage in ulcerative colitis.
Goyette, Philippe; Boucher, Gabrielle; Mallon, Dermot et al

in Nature Genetics (2015), 47(2), 172-9

Genome-wide association studies of the related chronic inflammatory bowel diseases (IBD) known as Crohn's disease and ulcerative colitis have shown strong evidence of association to the major histocompatibility complex (MHC). This region encodes a large number of immunological candidates, including the antigen-presenting classical human leukocyte antigen (HLA) molecules. Studies in IBD have indicated that multiple independent associations exist at HLA and non-HLA genes, but they have lacked the statistical power to define the architecture of association and causal alleles. To address this, we performed high-density SNP typing of the MHC in >32,000 individuals with IBD, implicating multiple HLA alleles, with a primary role for HLA-DRB1*01:03 in both Crohn's disease and ulcerative colitis. Noteworthy differences were observed between these diseases, including a predominant role for class II HLA variants and heterozygous advantage observed in ulcerative colitis, suggesting an important role of the adaptive immune response in the colonic environment in the pathogenesis of IBD.

Full Text
Peer Reviewed
Practical aspects of genome-wide association interaction analysis.
Gusareva, Elena ULg; Van Steen, Kristel ULg

in Human Genetics (2014), 133(11), 1343-58

Large-scale epistasis studies can give new clues to system-level genetic mechanisms and a better understanding of the underlying biology of human complex disease traits. Though many novel methods have been proposed to carry out such studies, so far only a few of them have demonstrated replicable results. Here, we propose a minimal protocol for genome-wide association interaction (GWAI) analysis to identify gene–gene interactions from large-scale genomic data. The different steps of the developed protocol are discussed and motivated, and encompass interaction screening in a hypothesis-free and hypothesis-driven manner. In particular, we examine a wide range of aspects related to epistasis discovery in the context of complex traits in humans, thereby giving practical recommendations for data quality control, variant selection or prioritization strategies and analytic tools, replication and meta-analysis, biological validation of statistical findings and other related aspects. The minimal protocol provides guidelines and attention points for anyone involved in GWAI analysis and aims to enhance the biological relevance of GWAI findings. At the same time, the protocol enables a better assessment of the strengths and weaknesses of published GWAI methodologies.

Full Text
Peer Reviewed
Genome-Wide Association Interaction Analysis for Alzheimer’s Disease
Gusareva, Elena ULg; Carrasquillo, Minerva M.; Bellenguez, Céline et al

in Neurobiology of Aging (2014)

We propose a minimal protocol for exhaustive genome-wide association interaction analysis that involves screening for epistasis over large-scale genomic data, combining the strengths of different methods and statistical tools. The different steps of this protocol are illustrated on a real-life data application for Alzheimer's disease (AD) (2259 patients and 6017 controls from France). In particular, in the exhaustive genome-wide epistasis screening we identified an AD-associated interacting SNP pair from chromosome 6q11.1 (rs6455128, the KHDRBS2 gene) and 13q12.11 (rs7989332, the CRYL1 gene) (p = 0.006, corrected for multiple testing). A replication analysis in the independent AD cohort from Germany (555 patients and 824 controls) confirmed the discovered epistasis signal (p = 0.036). This signal was also supported by a meta-analysis approach in 5 independent AD cohorts that was applied in the context of epistasis for the first time. Transcriptome analysis revealed negative correlation between expression levels of KHDRBS2 and CRYL1 in both the temporal cortex (β = -0.19, p = 0.0006) and cerebellum (β = -0.23, p < 0.0001) brain regions. This is the first time a replicable epistasis associated with AD was identified using a hypothesis-free screening approach.

Full Text
Peer Reviewed
Genetic regulation of immunoglobulin E level in different pathological states: integration of mouse and human genetics
Gusareva, Elena ULg; Kurey, Irina; Grekov, Igor et al

in Biological Reviews of the Cambridge Philosophical Society (2014), 89(2), 375-405

Immunoglobulin E (IgE) first evolved in mammals. It plays an important role in defence against helminths and parasitic infection and in pathological states including allergic reactions, anti-tumour defence and autoimmune diseases. Elucidation of genetic control of IgE level could help us to understand regulation of the humoral immune response in health and disease, the etiology and pathogenesis of many human diseases, and to facilitate discovery of more effective methods for their prevention and cure. Herein we summarise progress in the genetics of regulation of IgE level in human diseases and show that integration of different approaches and use of animal models have synergistic effects in gaining new knowledge about both protective and pathological roles of this important antibody.

Full Text
Genome-wide association interaction analysis for Alzheimer’s disease.
Gusareva, Elena ULg; Bellenguez, C; Cuyvers, E et al

Poster (2014, January 27)

Full Text
Genome-wide environmental interaction analysis using multidimensional data reduction principles to identify asthma pharmacogenetic loci in relation to corticosteroid therapy
Van Lishout, François ULg; Bessonov, Kyrylo ULg; Duan, Quingling et al

Poster (2013, October 25)

Genome-wide gene-environment (GxE) and gene-gene (GxG) interaction studies share many challenges because of the common genetic component they involve. GWEI studies may therefore benefit from the abundance of methodologies that are available in the context of genome-wide epistasis detection. One of these is Model-Based Multifactor Dimensionality Reduction (MB-MDR), which does not make any assumption about the genetic inheritance model. MB-MDR involves reducing a high-dimensional GxE space to GxE factor levels that exhibit high, low or no evidence for association with disease outcome. In contrast to logistic regression and random forests, MB-MDR can be used to detect GxE interactions in the absence of any main effects or when sample sizes are too small to model all main and GxE interaction effects. In this ongoing study, we demonstrate the opportunities and challenges of MB-MDR for genome-wide GxE interaction analysis and analyze the difference in prebronchodilator FEV1 following 8 weeks of inhaled corticosteroid therapy for 565 pediatric Caucasian CAMP participants (ages 5-12) from the SHARE project.

Full Text
Peer Reviewed
Genome-wide association interaction analysis for Alzheimer’s disease.
Gusareva, Elena ULg; Bellenguez, C; Cuyvers, E et al

Poster (2013, October)

Identification of epistasis is a challenging task that, when successful, gives new clues to systems-level genetics, where the complexity of the underlying biology of human disease can be better understood. Though many novel methods for detecting epistasis have been proposed and many epistasis detection studies have been conducted, so far few studies have demonstrated replicable epistasis. In the present work, we propose a minimal protocol for exhaustive genome-wide association interaction (GWAI) analysis that involves screening for epistasis over large-scale genomic data, combining the strengths of different methods and statistical tools. The different steps of this protocol are illustrated on a real-life data application for Alzheimer’s disease (a large cohort of 2259 patients and 6017 controls from France). Using this protocol, we identified an AD-associated interacting SNP pair from chromosome 6q11.1 (rs6455128, the KHDRBS2 gene) and 13q12.11 (rs7989332, the CRYL1 gene) and male-specific epistasis between SNPs from chromosome 5q34 (rs729149 and rs3733980, the WWC1 gene) and 15q22.2 (rs9806612, rs9302230 and rs7175766, the TLN2 gene). The transcriptome analysis revealed negative correlation between expression levels of KHDRBS2 and CRYL1 in both the temporal cortex and cerebellum brain regions and positive correlation between the expression levels of CRYL1 and WWC1 in the temporal cortex brain region. A replication analysis strategy and a meta-analysis approach in independent data confirmed effects of some of the discovered interactions.

Full Text
Peer Reviewed
An efficient algorithm to perform multiple testing in epistasis screening
Van Lishout, François ULg; Mahachie John, Jestinah ULg; Gusareva, Elena ULg et al

in BMC Bioinformatics (2013), 14

Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. Gene-gene interaction studies will require memory proportional to the squared number of SNPs. A genome-wide epistasis search would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. In this work we present a new version of maxT, requiring an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MBMDR-3.0.3. We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn's disease. Results: In the case of a binary (affected/unaffected) trait, the parallel workflow of MBMDR-3.0.3 analyzes all gene-gene interactions in a dataset of 100,000 SNPs typed on 1000 individuals within 4 days and 9 hours, using 999 permutations of the trait to assess statistical significance, on a cluster composed of 10 blades, each containing four Quad-Core AMD Opteron Processor 2352 2.1 GHz. In the case of a continuous trait, a similar run takes 9 days. Our program found 14 SNP-SNP interactions with a multiple-testing corrected p-value of less than 0.05 on real-life Crohn's disease data.
Conclusions: Our software is the first implementation of the MB-MDR methodology able to solve large-scale SNP-SNP interaction problems within a few days, without using much memory, while adequately controlling the type I error rates. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn's disease, MBMDR-3.0.3 could identify epistasis involving regions that are well known in the field and could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype higher-order associations.
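
The idea of a permutation-based maxT correction whose memory footprint does not grow with the number of tested SNP pairs can be sketched as follows. This is a minimal single-step maxT illustration in Python, not the MBMDR-3.0.3 implementation (which the abstract describes as C++ and which is not reproduced here). The function names, the toy interaction statistic and the `top_k` parameter are assumptions made for the example.

```python
import heapq
import itertools

import numpy as np


def pair_statistic(g1, g2, trait):
    """Toy interaction statistic (assumption): |correlation| of the genotype product with the trait."""
    x = g1 * g2
    if x.std() == 0:
        return 0.0
    return abs(np.corrcoef(x, trait)[0, 1])


def iter_pair_statistics(genotypes, trait):
    """Yield (statistic, (i, j)) lazily, so the full list of pairs is never stored."""
    for i, j in itertools.combinations(range(genotypes.shape[1]), 2):
        yield pair_statistic(genotypes[:, i], genotypes[:, j], trait), (i, j)


def single_step_maxt(genotypes, trait, n_perm=99, top_k=5, seed=0):
    """Return [(pair, statistic, adjusted p-value)] for the top_k pairs.

    Memory use is O(top_k + n_perm): only the best observed pairs and one
    maximum per permutation are kept, whatever the number of pairs.
    """
    rng = np.random.default_rng(seed)
    top = heapq.nlargest(top_k, iter_pair_statistics(genotypes, trait))
    perm_max = np.empty(n_perm)
    for b in range(n_perm):
        permuted = rng.permutation(trait)
        perm_max[b] = max(s for s, _ in iter_pair_statistics(genotypes, permuted))
    return [
        (pair, stat, (1 + np.sum(perm_max >= stat)) / (1 + n_perm))
        for stat, pair in top
    ]
```

The trade-off in this sketch is recomputation: every permutation rescans all pairs, so memory is saved at the cost of CPU time, which is why such a scan is a natural candidate for parallel execution.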

Full Text
Peer Reviewed
Genome-wide association interaction analysis for complex diseases: an example on Alzheimer’s disease.
Gusareva, Elena ULg; Cuyvers, E; Colon, S et al

Poster (2013, April)

Objectives: Common genetic mutations that can be detected via a genome-wide association (GWA) study and at the same time have a strong contribution to disease risk are fairly limited. Some of the genetic variants in humans are either rare, and thus more difficult to identify, or they are common but exert relatively small or even no individual effects that are masked or enhanced by one or several genes. The discovery of interacting genetic variants, possibly explaining part of the hidden genetic heritability, requires the development of sophisticated strategies and bioinformatics tools. Methods: In the present study, we propose a minimal protocol for genome-wide association interaction (GWAI) analysis that involves screening over large-scale genomic data in the search for epistatic or synergetic effects. The different steps of this minimal protocol are illustrated on a real-life data application for Alzheimer disease (AlzD) (large human cohort of 2,259 cases and 6,017 controls from France) and the pros and cons of the approaches are discussed. Results: Using the protocol, we identified two pairs of AlzD-associated interacting SNPs: from chromosome 6q11.1 and 13q12.11 and male-specific epistasis between SNPs from chromosome 5q34 and 15q22.2. Conclusion: In the present work we developed and applied an epistasis detection protocol to perform a comprehensive genome-wide search for AlzD-associated epistatic effects, thereby combining the strengths of different strategies, methods and statistical tools. It is the first time an epistasis study of this magnitude has been conducted in the context of AlzD. We show the advantages of viewing and analyzing data from different angles. A replication analysis strategy adapted to the epistasis detection context, as well as a meta-analytic approach, confirmed effects of the discovered interactions. Apart from the biological and clinical importance, the present work offers a roadmap for future investigations in the field of epistasis detection and interpretation.

Full Text
Peer Reviewed
Genome-wide Epistasis Screening for Alzheimer’s Disease
Gusareva, Elena ULg; Bellenguez, C; Sleegers, K et al

Poster (2013, March)

Objectives: Alzheimer disease (AlzD) is a complex, progressive neurodegenerative disease where dementia symptoms (loss of memory and other intellectual abilities) gradually worsen over a number of years. The disease is characterized by the neuropathologic findings of neurofibrillary tangles and amyloid plaques that accumulate in vulnerable brain regions. AlzD is inherited as a complex trait and appears to be highly heritable, with 58-79 percent attributable to genetic factors. So far, although a number of main-effect genes have been identified, only a fraction of AlzD cases can be explained by specific gene mutations. In our study we performed an exhaustive and selective genome-wide screening for SNP-SNP interactions associated with AlzD in a large case/control cohort to reveal hidden heritability that can be accounted for by epistasis. Methods: We developed a minimal protocol for genome-wide association interaction (GWAI) analysis that involves screening over large-scale genomic data in the search for epistatic or synergetic effects. The protocol was applied to a large human cohort of 2,259 AlzD cases and 6,017 healthy controls from France to search for AlzD-associated epistatic effects. Results: In the exhaustive genome-wide screening, we identified two pairs of AlzD-associated interacting SNPs from chromosomes 6q11.1 and 13q12.11, and male-specific epistasis between SNPs from chromosomes 5q34 and 15q22.2. In the selective epistasis search, screening over the candidate genes for AlzD previously reported to be in interaction, we replicated seven out of twelve AlzD-associated gene pairs (INS / PPARA, IL1A / PPARA, IL10 / PPARA, TF / HFE, MTHFR / IL6, ABCA1 / NPC1, LRP1 / MAPT). Conclusion: It is the first time an epistasis study of this magnitude has been conducted in the context of AlzD. We show the advantages of viewing and analyzing data from different angles. A replication analysis strategy adapted to the epistasis detection context, as well as a meta-analytic approach, confirmed effects of the discovered interactions. Apart from the biological and clinical importance, the present work offers a roadmap for future investigations in the field of epistasis detection and interpretation.

Full Text
Peer Reviewed
Integrating biological information in genome-wide association interaction (GWAI) studies.
Gusareva, Elena ULg; Van Steen, Kristel ULg

Poster (2012, May 30)

Genome-wide association (GWA) studies of Crohn's disease have identified numerous genes. However, a substantial portion of the heritability of this disease remains unexplained. Some gene variants, not detectable via a main-effects GWA study, may manifest themselves only in interaction with other variants. To search for interacting genes involved in the regulation of Crohn's disease, we performed GWA epistasis screening in a large human cohort (1851 cases/2938 controls) belonging to the Wellcome Trust Case Control Consortium (WTCCC). All subjects were genotyped with the GeneChip 500K Mapping Array Set (Affymetrix chip). SNPs that passed our quality control (359,479 SNPs) were processed in Biofilter (a software package that looks for candidate epistatic genes contributing to disease risk), giving rise to 14,185 SNPs. Subsequent MB-MDR epistasis screening highlighted evidence for statistical epistasis for 6 SNP pairs. Four of these pairs involve an interaction between the HTR3B and USP28 genes (chromosome 11). One pair involves an interaction between the SLED1 and ACSL1 genes (chromosome 4). Another pair involves an interaction between PTPN22 (chromosome 1) and a SNP located between USPL1 and ALOX5AP (chromosome 13). Notably, when mapped to the same genomic regions, the interacting SNPs were not in linkage disequilibrium (r^2 < 0.11). All results were corrected for potential main SNP effects and were corrected for multiple testing on 7072 SNPs (i.e., 25,003,056 pairs). We investigated the consistency of these findings using different methodologies, including information-based, LD comparison-based and regression-based approaches. We also investigated the utility of using prior information from biological interaction databases in novel epistasis discoveries, as well as the utility of biological databases in interpreting a posteriori identified statistical epistasis pairs. Our findings may provide new leads on important pathways involved in Crohn's disease.
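
The pair count quoted in the abstract above follows from the number of unordered SNP pairs, n(n-1)/2. The short Python check below reproduces it. The Bonferroni threshold is included only as an illustrative assumption of how strict a per-pair cutoff becomes at this scale; the abstract does not state which multiple-testing correction was used, and the function names are hypothetical.

```python
import math


def n_snp_pairs(n_snps):
    """Number of unordered SNP-SNP pairs among n_snps markers."""
    return math.comb(n_snps, 2)


def bonferroni_threshold(n_snps, alpha=0.05):
    """Per-pair significance threshold under a Bonferroni correction (illustrative assumption)."""
    return alpha / n_snp_pairs(n_snps)


print(n_snp_pairs(7072))  # 25003056
```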

Peer Reviewed
Application of mixed polygenic model to control for cryptic/genuine relatedness and population stratification.
Gusareva, Elena ULg; Mahachie John, Jestinah ULg; Isaacs, Aaron et al

Poster (2012, March 12)

In genome-wide association studies (GWAs), population stratification may cause inflated type I errors and overly optimistic test results when not properly corrected for. During the past decade, several methods have been proposed for association testing in the presence of population stratification. Among these, principal components-based approaches are the most popular. Principal component analysis (PCA) allows data transformation to a new coordinate system such that the projection of the data along the first new coordinate (called PC1) has the largest variance; the second PC has the second largest variance, and so on. In practice, two components are usually enough to adjust or to control for population stratification. They can easily be included in parametric association models as covariates. Despite the success of this strategy, there are still some caveats which need further attention. Among these are that principal component-based methods generally do not account for cryptic relatedness (kinship) between supposedly unrelated individuals, are not straightforwardly adapted to accommodate family-based designs or mixtures of families and unrelated individuals, and do not always take proper account of the trait under investigation. In this work, we present an easy-to-use alternative that addresses the aforementioned issues. For quantitative traits, we propose first to use the mixed polygenic model (possibly taking into account important non-genetic confounders as covariates), second to derive “polygenic” residuals from this model (thereby removing genomic kinship relationships), and third to consider these residuals as new traits in a classical genome-wide QTL analysis for “unrelated individuals”. The polygenic component of the aforementioned mixed polygenic model describes the contribution from multiple independently segregating genes, all having a small additive effect on the trait under investigation.
Via an extensive simulation study, with various settings of population stratification and admixture, we show that this approach not only removes most of the “relatedness” between individuals (cryptic relatedness or known relatedness), but also removes most of the remaining substructures caused by population stratification or admixture. As a proof of concept, we demonstrate the efficiency of this robust method to control for population stratification on real-life genome-scale data from the SNP Health Association Resource (SHARe) Asthma Resource project (SHARP) (dbGaP accession number phs000166.v2.p1). We also provide leads to extend this method to dichotomous traits.
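
The principal components strategy that the abstract above uses as its point of comparison (compute the leading PCs of the genotype matrix and include them as covariates) can be sketched in Python as follows. This is a minimal illustration of that baseline only, not of the mixed polygenic model the authors propose. The function names and the plain least-squares fit are assumptions made for the example.

```python
import numpy as np


def top_principal_components(genotypes, n_components=2):
    """Project centred genotypes onto the leading principal axes via SVD."""
    centred = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centred, full_matrices=False)
    # Columns are ordered by decreasing variance: PC1, PC2, ...
    return u[:, :n_components] * s[:n_components]


def snp_effect_adjusted(snp, trait, covariates):
    """Least-squares SNP effect on the trait, adjusted for the covariates (e.g. PCs)."""
    design = np.column_stack([np.ones(len(trait)), snp, covariates])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    return coef[1]  # coefficient of the SNP column
```

A typical call would compute `pcs = top_principal_components(G)` once for the whole genotype matrix `G` and then pass `pcs` as `covariates` for each tested SNP.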

Full Text
Peer Reviewed
Lower-Order Effects Adjustment in Quantitative Traits Model-Based Multifactor Dimensionality Reduction
Mahachie John, Jestinah ULg; Cattaert, Tom ULg; Van Lishout, François ULg et al

in PLoS ONE (2012)

Identifying gene-gene interactions or gene-environment interactions in studies of human complex diseases remains a big challenge in genetic epidemiology. An additional challenge, often forgotten, is to account for important lower-order genetic effects. These may hamper the identification of genuine epistasis. If lower-order genetic effects contribute to the genetic variance of a trait, identified statistical interactions may simply be due to a signal boost of these effects. In this study, we restrict attention to quantitative traits and bi-allelic SNPs as genetic markers. Moreover, our interaction study focuses on 2-way SNP-SNP interactions. Via simulations, we assess the performance of different corrective measures for lower-order genetic effects in Model-Based Multifactor Dimensionality Reduction epistasis detection, using additive and co-dominant coding schemes. Performance is evaluated in terms of power and familywise error rate. Our simulations indicate that empirical power estimates are reduced with correction of lower-order effects, as are familywise error rates. Easy-to-use automatic SNP selection procedures, SNP selection based on “top” findings, or SNP selection based on a p-value criterion for interesting main effects result in reduced power but also almost zero false positive rates. Always accounting for main effects in the SNP-SNP pair under investigation during Model-Based Multifactor Dimensionality Reduction analysis adequately controls false positive epistasis findings. This is particularly true when adopting a co-dominant corrective coding scheme. In conclusion, automatic search procedures to identify lower-order effects to correct for during epistasis screening should be avoided. The same is true for procedures that adjust for lower-order effects prior to Model-Based Multifactor Dimensionality Reduction and involve using residuals as the new trait. We advocate “on-the-fly” adjustment for lower-order effects when screening for SNP-SNP interactions using Model-Based Multifactor Dimensionality Reduction analysis.
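
The difference between the additive and co-dominant coding schemes mentioned in the abstract above can be made concrete with a small Python sketch. It only shows how the two codings differ as regression design columns for a single SNP's main effect; it is not MB-MDR and not the authors' adjustment procedure. The function names and the toy trait are assumptions made for the example.

```python
import numpy as np


def additive_coding(genotype):
    """One column: the minor-allele count (0, 1 or 2)."""
    return genotype.astype(float).reshape(-1, 1)


def codominant_coding(genotype):
    """Two indicator columns: heterozygote and minor-allele homozygote."""
    return np.column_stack([genotype == 1, genotype == 2]).astype(float)


def residual_after_main_effect(trait, genotype, coding):
    """Residual trait after a least-squares fit of the SNP main effect under the given coding."""
    design = np.column_stack([np.ones(len(trait)), coding(genotype)])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    return trait - design @ coef


# Toy trait driven by heterozygotes only, i.e. a purely non-additive main effect.
g = np.array([0, 1, 2, 1, 0, 2, 1, 0])
y = (g == 1).astype(float)
res_additive = residual_after_main_effect(y, g, additive_coding)
res_codominant = residual_after_main_effect(y, g, codominant_coding)
```

In this toy case the co-dominant coding absorbs the main effect completely, while the additive coding leaves part of it in the residuals, which is one way to see why the choice of corrective coding can matter.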

Full Text
An Efficient Algorithm to Perform Multiple Testing in Epistasis Screening
Van Lishout, François ULg; Cattaert, Tom ULg; Mahachie John, Jestinah ULg et al

Conference (2011, December 13)

Background: Research in epistasis or gene-gene interaction detection for human complex traits has grown exponentially over the last few years. It has been marked by promising methodological developments, improved translation efforts of statistical epistasis to biological epistasis and attempts to integrate different omics information sources into the epistasis screening to enhance power. The quest for gene-gene interactions poses severe multiple-testing problems. In this context, the maxT algorithm is one technique to control the false-positive rate. However, the memory needed by this algorithm rises linearly with the number of hypothesis tests. In main-effects detection, this is not a problem since the memory required is proportional to the number of SNPs. In contrast, gene-gene interaction studies will require memory proportional to the squared number of SNPs. A genome-wide epistasis screening would therefore require terabytes of memory. Hence, cache problems are likely to occur, increasing the computation time. Methods: In this work we present a new version of maxT, requiring an amount of memory independent of the number of genetic effects to be investigated. This algorithm was implemented in C++ in our epistasis screening software MB-MDR-2.6.2 and compared to MB-MDR's first implementation as an R-package (Calle et al., Bioinformatics 2010). We evaluate the new implementation in terms of memory efficiency and speed using simulated data. The software is illustrated on real-life data for Crohn's disease. Results: The sequential version of MBMDR-2.6.2 is approximately 5,500 times faster than its R counterpart. The parallel version (tested on a cluster composed of 14 blades, each containing 4 quad-core Intel Xeon E5520 CPUs @ 2.27 GHz) is approximately 900,000 times faster than the latter, for results of the same quality on the simulated data. It analyses all gene-gene interactions of a dataset of 100,000 SNPs typed on 1000 individuals within 4 days. Our program found 14 SNP-SNP interactions with a p-value less than 0.05 on the real-life Crohn’s disease data. Conclusions: Our software is able to solve large-scale SNP-SNP interaction problems within a few days, without using much memory. A new implementation to reach genome-wide epistasis screening is under construction. In the context of Crohn's disease, MBMDR-2.6.2 found signals in regions well known in the field and our results could be explained from a biological point of view. This demonstrates the power of our software to find relevant phenotype-genotype associations.

Peer Reviewed
Comparison Of Different Methods For Detecting Gene-Gene Interactions In Case-Control Data
Cattaert, Tom ULg; Rial Garcia, J. A.; Gusareva, Elena ULg et al

Poster (2011, September 19)

It is generally believed that epistasis makes an important contribution to the genetic architecture of complex disease, and numerous statistical and bioinformatics methods have been developed to detect it. We compare several state-of-the-art epistasis detection methods in terms of empirical power, type-I error control, and CPU time. The methods compared include Model-Based Multifactor Dimensionality Reduction (MB-MDR) [1, 2], BOolean Operation-based Screening and Testing (BOOST) [3], EPIBLASTER [4], Random Jungle (RJ) [5], Logistic Regression and PLINK. Our comparative study is based on an extensive simulation study using different two-locus models, exhibiting both main effects and epistasis [3]. In these simulations, 100 SNPs are generated with no LD between them. All genotypes are assumed to be in Hardy-Weinberg equilibrium. Furthermore, 2 disease-associated SNPs are selected, with MAFs set to 0.1, 0.2 and 0.4. The MAFs of the non-disease-associated SNPs are uniformly distributed on [0.05, 0.5]. In order to achieve high accuracy in empirical power estimation, all simulation settings involve 1000 replicates. All methods are applied to WTCCC Crohn's Disease data. [1] Calle, M.L. et al. (2008), Tech. Rep. No. 24, Dep. of Systems Biology, Univ. de Vic [2] Cattaert, T. et al. (2011), Ann. Hum. Gen. 75, 78-89 [3] Wan, X. et al. (2010), Am. J. Hum. Gen. 87, 325-340 [4] Kam-Thong, T. et al. (2011), Eur. J. Hum. Gen. 19, 465-471 [5] Schwarz, D.F. et al. (2010), Bioinf. 26, 1752-1758
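
The genotype-generation part of the simulation design described above (independent SNPs in Hardy-Weinberg equilibrium, null MAFs uniform on [0.05, 0.5]) can be sketched in Python as follows. The two-locus disease models of [3] are not reproduced here. The function names, the sample size, the seed and the assumption that the 2 disease-associated SNPs are counted among the 100 are illustrative choices, not details taken from the poster.

```python
import random


def hwe_genotype(maf, rng):
    """Draw one genotype (count of minor alleles: 0, 1 or 2) under
    Hardy-Weinberg equilibrium, i.e. two independent allele draws."""
    return (rng.random() < maf) + (rng.random() < maf)


def simulate_genotypes(n_individuals, n_null_snps=98, causal_mafs=(0.2, 0.2), seed=0):
    """Return (mafs, data), where data[k][s] is the genotype of individual k at SNP s.

    The first len(causal_mafs) columns are the disease-associated SNPs
    (assumed here to be part of the 100 SNPs); the remaining columns are
    null SNPs with MAF drawn uniformly from [0.05, 0.5].
    SNPs are drawn independently of each other, so there is no LD between them.
    """
    rng = random.Random(seed)
    mafs = list(causal_mafs) + [rng.uniform(0.05, 0.5) for _ in range(n_null_snps)]
    data = [[hwe_genotype(m, rng) for m in mafs] for _ in range(n_individuals)]
    return mafs, data
```

Repeating the call with different seeds would give independent replicates; assigning case-control status from a two-locus penetrance table is the step deliberately left out of this sketch.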

Peer Reviewed
A robustness study to investigate the performance of parametric and non-parametric tests used in Model-Based Multifactor Dimensionality Reduction Epistasis Detection.
Mahachie John, Jestinah ULg; Gusareva, Elena ULg; Van Lishout, François ULg et al

Poster (2011, September 19)

Model-Based Multifactor Dimensionality Reduction (MB-MDR) is a data mining technique to identify gene-gene interactions among thousands of SNPs in a fast way, without making assumptions about the mode of genetic interactions. By construction, one of the implementations of MB-MDR involves testing one multi-locus genotype cell versus the remaining cells, thereby creating two imbalanced groups for trait distribution comparison. To date, for continuous traits, we have adopted a standard F-test to compare these groups. When the normality or homoscedasticity assumption no longer holds, highly inflated results are to be expected. The power and type I error control of MB-MDR under these assumptions have been thoroughly investigated in Mahachie John et al. [1]. The aim of this study is to assess, through simulations, the effects of ANOVA model violations on the performance of MB-MDR. We quantify their effect on MB-MDR using default options, but at the same time introduce alternative options with increased performance. The better handling of imbalanced data using robust approaches [2] within an MB-MDR context is exemplified on real data for asthma-related phenotypes. [1] EJHG (2011), Early view [2] David Freedman, Statistical Models: Theory and Practice, Cambridge University Press (2000), ISBN 978-0521671057
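
The one-cell-versus-rest comparison described above can be sketched in Python as a two-group F statistic computed for each two-locus genotype cell. This is a simplified illustration, not MB-MDR code: the function name cell_vs_rest_f and the returned layout are assumptions, and the robust alternatives mentioned in the abstract are not shown. The group sizes are returned alongside each statistic because the imbalance between them is the issue the poster raises.

```python
from collections import defaultdict


def cell_vs_rest_f(g1, g2, trait):
    """For each observed two-locus genotype cell, compute the classical
    two-group F statistic (1 and n - 2 degrees of freedom) comparing the
    trait inside the cell with the trait in all remaining cells.

    Returns {cell: (F, n_in_cell, n_rest)}.
    """
    cells = defaultdict(list)
    for a, b, y in zip(g1, g2, trait):
        cells[(a, b)].append(y)

    n = len(trait)
    out = {}
    for cell, ys in cells.items():
        rest = [y for c, other in cells.items() if c != cell for y in other]
        n1, n2 = len(ys), len(rest)
        if n1 == 0 or n2 == 0 or n < 3:
            continue  # comparison undefined
        m1, m2 = sum(ys) / n1, sum(rest) / n2
        # Between-group sum of squares for two groups.
        ss_between = n1 * n2 / n * (m1 - m2) ** 2
        # Pooled within-group sum of squares (assumes equal variances).
        ss_within = sum((y - m1) ** 2 for y in ys) + sum((y - m2) ** 2 for y in rest)
        if ss_within == 0:
            continue  # no residual variation, F undefined
        out[cell] = (ss_between / (ss_within / (n - 2)), n1, n2)
    return out
```

With 9 possible two-locus cells and unequal genotype frequencies, n_in_cell is often far smaller than n_rest, which is where the pooled-variance assumption of the F-test becomes fragile.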

Peer Reviewed
Genome-wide epistasis screening for Crohn's disease
Gusareva, Elena ULg; Van Steen, Kristel ULg

Poster (2011, September 19)

Genome-wide association (GWA) studies of Crohn's disease have identified numerous genes. However, a substantial portion of the heritability of this disease remains unexplained. Some gene variants, not detectable via a main-effects GWA study, may manifest themselves only in interaction with other variants. To search for interacting genes involved in the regulation of Crohn's disease, we performed GWA epistasis screening in a large human cohort (1851 cases/2938 controls) belonging to the Wellcome Trust Case Control Consortium (WTCCC). All subjects were genotyped with the GeneChip 500K Mapping Array Set (Affymetrix chip). SNPs that passed our quality control (359,479 SNPs) were processed in Biofilter (a software package that looks for candidate epistatic genes contributing to disease risk), giving rise to 14,185 SNPs. Subsequent MB-MDR epistasis screening discovered four pairs of interacting SNPs on chromosome 4q35.1 and eight pairs on chromosome 11q23.2. The identified pairs of SNPs were confirmed with synergy-based measures. Notably, despite their mapping to the same genomic regions, the interacting SNPs were not in LD (r^2 < 0.5). Our findings support the idea of close chromosomal localization of two pairs of interacting genes that are involved in the development of Crohn's disease.
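
The LD check mentioned above (r^2 < 0.5) can be illustrated with a short Python sketch. It computes the squared Pearson correlation of genotype counts, a common approximation to haplotype-based r^2. The poster does not state which estimator was used, so both the estimator and the function name genotype_r2 are assumptions.

```python
def genotype_r2(g1, g2):
    """Squared Pearson correlation between two genotype vectors coded as
    minor-allele counts (0, 1, 2). Returns 0.0 if either SNP is
    monomorphic in the sample, where the correlation is undefined."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) * (a - m1) for a in g1)
    v2 = sum((b - m2) * (b - m2) for b in g2)
    if v1 == 0 or v2 == 0:
        return 0.0
    return cov * cov / (v1 * v2)
```

Under this approximation, a pair of SNPs would be treated as not in LD, in the sense used in the abstract, when genotype_r2 returns a value below 0.5.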

Full Text
Peer Reviewed
A genome-wide linkage study of individuals with high scores on NEO personality traits
Amin, Najaf; Schuur, M.; Gusareva, Elena ULg et al

in Molecular Psychiatry (2011)

The NEO-Five-Factor Inventory divides human personality traits into five dimensions: neuroticism, extraversion, openness, conscientiousness and agreeableness. In this study, we sought to identify regions harboring genes with large effects on the five NEO personality traits by performing genome-wide linkage analysis of individuals scoring in the extremes of these traits (> 90th percentile). Affected-only linkage analysis was performed using an Illumina 6K linkage array in a family-based study, the Erasmus Rucphen Family study. We subsequently determined whether distinct, segregating haplotypes found with linkage analysis were associated with the trait of interest in the population. Finally, a dense single-nucleotide polymorphism genotyping array (Illumina 318K) was used to search for copy number variations (CNVs) in the associated regions. In the families with extreme phenotype scores, we found significant evidence of linkage for conscientiousness to 20p13 (rs1434789, log of odds (LOD) = 5.86) and suggestive evidence of linkage (LOD > 2.8) for neuroticism to 19q, 21q and 22q, extraversion to 1p, 1q, 9p and 12q, openness to 12q and 19q, and agreeableness to 2p, 6q, 17q and 21q. Further analysis determined haplotypes in 21q22 for neuroticism (P-values = 0.009, 0.007), in 17q24 for agreeableness (marginal P-value = 0.018) and in 20p13 for conscientiousness (marginal P-values = 0.058, 0.038) segregating in families with large contributions to the LOD scores. No evidence for CNVs in any of the associated regions was found. Our findings imply that there may be genes with relatively large effects involved in personality traits, which may be identified with next-generation sequencing techniques.
