References of "Huynh-Thu, Vân Anh"
Full Text
Peer Reviewed
Identification of a microRNA landscape targeting the PI3K/Akt signaling pathway in inflammation-induced colorectal carcinogenesis
JOSSE, Claire ULg; Bouznad, Nassim ULg; Geurts, Pierre ULg et al

in American Journal of Physiology - Gastrointestinal and Liver Physiology (2014), 306

Inflammation can contribute to tumor formation; however, markers that predict progression are still lacking. In the present study, the well-established azoxymethane (AOM)/dextran sulfate sodium (DSS)-induced mouse model of colitis-associated cancer was used to analyze microRNA (miRNA) modulation accompanying inflammation-induced tumor development and to determine whether inflammation-triggered miRNA alterations affect the expression of genes or pathways involved in cancer. A miRNA microarray experiment was performed to establish miRNA expression profiles in mouse colon at early and late time points during inflammation and/or tumor growth. Chronic inflammation and carcinogenesis were associated with distinct changes in miRNA expression. Nevertheless, prediction algorithms of miRNA-mRNA interactions and computational analyses based on ranked miRNA lists consistently identified putative target genes that play essential roles in tumor growth or that belong to key carcinogenesis-related signaling pathways. We identified PI3K/Akt and insulin-like growth factor-1 (IGF-1) signaling as the major pathways affected in the AOM/DSS model. DSS-induced chronic inflammation downregulates miR-133a and miR-143/145, which is reportedly associated with human colorectal cancer and PI3K/Akt activation. Accordingly, conditioned medium from inflammatory cells decreases the expression of these miRNAs in colorectal adenocarcinoma Caco-2 cells. Overexpression of miR-223, one of the main miRNAs showing strong upregulation during AOM/DSS tumor growth, inhibited Akt phosphorylation and IGF-1R expression in these cells. Cell sorting from mouse colons delineated distinct miRNA expression patterns in epithelial and myeloid cells during the periods preceding and spanning tumor growth. Hence, cell-type-specific miRNA dysregulation and subsequent PI3K/Akt activation may be involved in the transition from intestinal inflammation to cancer.

Full Text
Peer Reviewed
Gene regulatory network inference from systems genetics data using tree-based methods
Huynh-Thu, Vân Anh ULg; Wehenkel, Louis ULg; Geurts, Pierre ULg

in de la Fuente, Alberto (Ed.) Gene Network Inference - Verification of Methods for Systems Genetics Data (2013)

One of the pressing open problems of computational systems biology is the elucidation of the topology of gene regulatory networks (GRNs). In an attempt to solve this problem, the idea of systems genetics is to exploit the natural variations that exist between the DNA sequences of related individuals and that can represent the randomized and multifactorial perturbations necessary to recover GRNs. In this chapter, we present new methods, called GENIE3-SG-joint and GENIE3-SG-sep, for the inference of GRNs from systems genetics data. Experiments on the artificial data of the StatSeq benchmark and of the DREAM5 Systems Genetics challenge show that jointly exploiting expression and genetic data is very helpful for recovering GRNs, and one of our methods outperforms by a large margin the official best-performing method of the DREAM5 challenge.

Full Text
Peer Reviewed
Myelin-Derived Lipids Modulate Macrophage Activity by Liver X Receptor Activation
Bogie, Jeroen F. J.; Timmermans, Silke; Huynh-Thu, Vân Anh ULg et al

in PLoS ONE (2012), 7(9), 44998

Multiple sclerosis is a chronic, inflammatory, demyelinating disease of the central nervous system in which macrophages and microglia play a central role. Foamy macrophages and microglia, containing degenerated myelin, are abundantly found in active multiple sclerosis lesions. Recent studies have described an altered macrophage phenotype after myelin internalization. However, it is unclear by which mechanisms myelin affects the phenotype of macrophages and how this phenotype can influence lesion progression. Here we demonstrate, by using genome-wide gene expression analysis, that myelin-phagocytosing macrophages have an enhanced expression of genes involved in migration, phagocytosis and inflammation. Interestingly, myelin internalization also induced the expression of genes involved in liver-X-receptor signaling and cholesterol efflux. In vitro validation shows that myelin-phagocytosing macrophages indeed have an increased capacity to dispose of intracellular cholesterol. In addition, myelin suppresses the secretion of the pro-inflammatory mediator IL-6 by macrophages, an effect mediated by activation of liver-X-receptor β. Our data show that myelin modulates the phenotype of macrophages by nuclear receptor activation, which may subsequently affect lesion progression in demyelinating diseases such as multiple sclerosis.

Full Text
Peer Reviewed
Wisdom of crowds for robust gene network inference
Marbach, Daniel; Costello, James C.; Küffner, Robert et al

in Nature Methods (2012), 9

Reconstructing gene regulatory networks from high-throughput data is a long-standing challenge. Through the Dialogue on Reverse Engineering Assessment and Methods (DREAM) project, we performed a comprehensive blind assessment of over 30 network inference methods on Escherichia coli, Staphylococcus aureus, Saccharomyces cerevisiae and in silico microarray data. We characterize the performance, data requirements and inherent biases of different inference approaches, and we provide guidelines for algorithm application and development. We observed that no single inference method performs optimally across all data sets. In contrast, integration of predictions from multiple inference methods shows robust and high performance across diverse data sets. We thereby constructed high-confidence networks for E. coli and S. aureus, each comprising ~1,700 transcriptional interactions at a precision of ~50%. We experimentally tested 53 previously unobserved regulatory interactions in E. coli, of which 23 (43%) were supported. Our results establish community-based methods as a powerful and robust tool for the inference of transcriptional gene regulatory networks.

Full Text
Peer Reviewed
Statistical interpretation of machine learning-based feature importance scores for biomarker discovery
Huynh-Thu, Vân Anh ULg; Saeys, Yvan; Wehenkel, Louis ULg et al

in Bioinformatics (2012), 28(13), 1766-1774

Motivation: Univariate statistical tests are widely used for biomarker discovery in bioinformatics. These procedures are simple and fast, and their output is easily interpretable by biologists, but they can only identify variables that provide a significant amount of information in isolation from the other variables. As biological processes are expected to involve complex interactions between variables, univariate methods thus potentially miss some informative biomarkers. Variable relevance scores provided by machine learning techniques, however, are potentially able to highlight multivariate interacting effects, but unlike the p-values returned by univariate tests, these relevance scores are usually not statistically interpretable. This lack of interpretability hampers the determination of a relevance threshold for extracting a feature subset from the rankings and also prevents the wide adoption of these methods by practitioners. Results: We evaluated several existing and novel procedures that extract relevant features from rankings derived from machine learning approaches. These procedures replace the relevance scores with measures that can be interpreted in a statistical way, such as p-values, false discovery rates, or family-wise error rates, for which it is easier to determine a significance level. Experiments were performed on several artificial problems as well as on real microarray datasets. Although the methods differ in terms of computing times and the tradeoff they achieve in terms of false positives and false negatives, some of them greatly help in the extraction of truly relevant biomarkers and should thus be of great practical interest for biologists and physicians. As a side conclusion, our experiments also clearly highlight that using model performance as a criterion for feature selection is often counter-productive.
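As a rough illustration of how a raw relevance score can be replaced by a statistically interpretable measure, the sketch below builds a null distribution for each feature's score by permuting the class labels and reports an empirical p-value. The scoring function (absolute difference of class means), the function names and the toy data are all assumptions made for brevity; they are not the procedures evaluated in the paper.

```python
import random


def mean_diff_score(values, labels):
    """Stand-in relevance score: absolute difference between class means."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


def permutation_p_values(features, labels, n_perm=200, seed=0):
    """Empirical p-value per feature, estimated from label permutations.

    features: dict mapping a feature name to a list of values (one per sample).
    labels:   list of 0/1 class labels, one per sample.
    """
    rng = random.Random(seed)
    observed = {name: mean_diff_score(vals, labels)
                for name, vals in features.items()}
    exceed = {name: 0 for name in features}
    shuffled = list(labels)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any link between features and labels
        for name, vals in features.items():
            if mean_diff_score(vals, shuffled) >= observed[name]:
                exceed[name] += 1
    # The add-one correction keeps every p-value strictly positive.
    return {name: (exceed[name] + 1) / (n_perm + 1) for name in features}


# Toy data (hypothetical): one feature separates the classes, one is noise.
labels = [0] * 10 + [1] * 10
data_rng = random.Random(1)
features = {
    "informative": [0.1 * i for i in range(10)] + [5 + 0.1 * i for i in range(10)],
    "noise": [data_rng.gauss(0, 1) for _ in range(20)],
}
p_values = permutation_p_values(features, labels)
```

A threshold on such p-values (possibly after a multiple-testing correction) then plays the role that an arbitrary cut-off on the raw scores would otherwise play.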

Full Text
Machine learning-based feature ranking: Statistical interpretation and gene network inference
Huynh-Thu, Vân Anh ULg

Doctoral thesis (2012)

Machine learning techniques, and in particular supervised learning methods, are nowadays widely used in bioinformatics. Two prominent applications that we target specifically in this thesis are biomarker discovery and regulatory network inference. These two problems are commonly addressed through the use of feature ranking methods that order the input features of a supervised learning problem from the most to the least relevant for predicting the output. This thesis presents, on the one hand, methodological contributions around machine learning-based feature ranking techniques and, on the other hand, more applied contributions on gene regulatory network inference. Our methodological contributions focus on the problem of selecting truly relevant features from machine learning-based feature rankings. Unlike the p-values returned by univariate tests, relevance scores derived from machine learning techniques to rank the features are usually not statistically interpretable. This lack of interpretability makes the identification of the truly relevant features among the top-ranked ones a very difficult task and hence prevents the wide adoption of these methods by practitioners. Our first contribution in this field concerns a procedure, based on permutation tests, that estimates for each subset of top-ranked features the probability for that subset to contain at least one irrelevant feature (called CER for "conditional error rate"). As a second contribution, we performed a large-scale evaluation of several existing or novel procedures, including our CER method, that all replace the original relevance scores with measures that can be interpreted in a statistical way. These procedures, which were assessed on several artificial and real datasets, differ greatly in terms of computing times and the tradeoff they achieve in terms of false positives and false negatives. Our experiments also clearly highlight that using model performance as a criterion for feature selection is often counter-productive.
The problem of gene regulatory network inference can be formulated as several feature selection problems, each one aiming at discovering the regulators of one target gene. Within this family of methods, we developed the GENIE3 algorithm, which exploits feature rankings derived from tree-based ensemble methods to infer gene networks from steady-state gene expression data. In a second step, we derived two extensions of GENIE3 that aim to infer regulatory networks from other types of data. The first extension exploits expression data provided by time course experiments, while the second extension is related to genetical genomics datasets, which contain expression data together with information about genetic markers. GENIE3 was the best performer in the DREAM4 In Silico Multifactorial challenge in 2009 and in the DREAM5 Network Inference challenge in 2010, and its extensions perform very well compared to other methods on several artificial datasets.

Full Text
Peer Reviewed
Inferring Regulatory Networks from Expression Data Using Tree-Based Methods
Huynh-Thu, Vân Anh ULg; Irrthum, Alexandre ULg; Wehenkel, Louis ULg et al

in PLoS ONE (2010), 5(9), 12776

One of the pressing open problems of computational systems biology is the elucidation of the topology of genetic regulatory networks (GRNs) using high throughput genomic data, in particular microarray gene expression data. The Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge aims to evaluate the success of GRN inference algorithms on benchmarks of simulated data. In this article, we present GENIE3, a new algorithm for the inference of GRNs that was the best performer in the DREAM4 In Silico Multifactorial challenge. GENIE3 decomposes the prediction of a regulatory network between p genes into p different regression problems. In each of the regression problems, the expression pattern of one of the genes (target gene) is predicted from the expression patterns of all the other genes (input genes), using the tree-based ensemble methods Random Forests or Extra-Trees. The importance of an input gene in the prediction of the target gene expression pattern is taken as an indication of a putative regulatory link. Putative regulatory links are then aggregated over all genes to provide a ranking of interactions from which the whole network is reconstructed. In addition to performing well on the DREAM4 In Silico Multifactorial challenge simulated data, we show that GENIE3 compares favorably with existing algorithms to decipher the genetic regulatory network of Escherichia coli. GENIE3 makes no assumption about the nature of gene regulation, can deal with combinatorial and non-linear interactions, produces directed GRNs, and is fast and scalable. In conclusion, we propose a new algorithm for GRN inference that performs well on both synthetic and real gene expression data. The algorithm, based on feature selection with tree-based ensemble methods, is simple and generic, making it adaptable to other types of genomic data and interactions.
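The decomposition described in this abstract (one regression problem per target gene, followed by aggregation of importances into a ranking of links) can be sketched as follows. This is a minimal illustration assuming NumPy and scikit-learn are available; it is not the authors' implementation. The function names, the tree count, the `max_features="sqrt"` setting and the use of scikit-learn's normalized `feature_importances_` as the importance measure are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def genie3_sketch(expr, n_trees=100, seed=0):
    """Return a (p, p) matrix W where W[i, j] scores the putative link i -> j.

    expr: array of shape (n_samples, p), one column per gene.
    """
    n_samples, p = expr.shape
    weights = np.zeros((p, p))
    for target in range(p):
        # One regression problem per target gene: predict it from all others.
        inputs = [g for g in range(p) if g != target]
        forest = RandomForestRegressor(
            n_estimators=n_trees, max_features="sqrt", random_state=seed
        )
        forest.fit(expr[:, inputs], expr[:, target])
        # Importance of each input gene is read as evidence for a link.
        weights[inputs, target] = forest.feature_importances_
    return weights


def rank_links(weights):
    """Aggregate all scores into one list of (score, regulator, target)."""
    p = weights.shape[0]
    links = [(weights[i, j], i, j)
             for i in range(p) for j in range(p) if i != j]
    return sorted(links, reverse=True)


# Toy data (hypothetical): gene 1 is driven by gene 0, gene 2 is pure noise.
rng = np.random.default_rng(0)
g0 = rng.normal(size=100)
g1 = 2.0 * g0 + 0.1 * rng.normal(size=100)
g2 = rng.normal(size=100)
expr = np.column_stack([g0, g1, g2])

weights = genie3_sketch(expr)
ranking = rank_links(weights)
```

Swapping `RandomForestRegressor` for scikit-learn's `ExtraTreesRegressor` would give an Extra-Trees variant of the same sketch.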

Full Text
Peer Reviewed
Exploiting tree-based variable importances to selectively identify relevant variables
Huynh-Thu, Vân Anh ULg; Wehenkel, Louis ULg; Geurts, Pierre ULg

in JMLR: Workshop and Conference Proceedings (2008), 4

Full Text
Detection of micro-RNA/gene interactions involved in angiogenesis using machine learning techniques
Huynh-Thu, Vân Anh ULg; Hiard, Samuel ULg; Geurts, Pierre ULg et al

Poster (2007, September)

Motivation: Angiogenesis is the process responsible for the growth of new blood vessels from existing ones. It is also associated with the development of cancer, as tumors need to be irrigated by blood vessels to grow. New cancer therapies are emerging that exploit angiogenesis inhibitors, also called angiostatic agents, to asphyxiate and starve the tumors. A better understanding of the regulatory mechanisms that control angiogenesis is thus fundamental. Recently, short non-coding RNA molecules, called micro-RNAs, have been discovered that are involved in post-transcriptional regulation of gene expression. These molecules bind to messenger RNAs following the base pairing rules, preventing them from being translated into proteins and/or tagging them for degradation. The main goal of this work is to use computational approaches to identify micro-RNAs involved in angiogenesis. Method: In order to identify genes involved in angiogenesis, bovine endothelial cells were treated with a known angiogenesis inhibitor [1], prolactin 16K, and their gene expression profile was compared to the profile of untreated cells. The genes were then divided into three classes: up-regulated, down-regulated, and unaffected genes. The 3'UTR regions of these genes were then analysed by machine learning techniques. Different approaches were considered. First, we described each gene by a vector of motif counts in its 3'UTR region and used machine learning techniques to rank the motifs according to their relevance for separating the genes into the different classes. We considered successively motifs corresponding to the seeds of known micro-RNAs and also all possible motifs of a given length. To rank the motifs, we compared ensembles of decision trees and linear support vector machines. Second, we considered an approach called Segment and Combine that was proposed in [2]. Finally, we also carried out an exhaustive search of all motifs of a given length that satisfy some constraints on specificity and coverage with respect to a given gene category.
Results: The ability of the different approaches to identify relevant motifs was first assessed on genes predicted to be the target of some known miRNAs. In this simple setting, most methods were able to identify the micro-RNA seed. The results obtained on the genes regulated by prolactin 16K are also very encouraging. We were able to identify one micro-RNA already known to play a role in angiogenesis, and several motifs are predicted by different approaches to be highly specific to up- or down-regulation by prolactin 16K. Their relationship with known micro-RNAs is certainly worth exploring. Conclusion: Machine learning approaches are promising techniques for the identification of micro-RNA/gene interactions. Future work will concern the application of the same kind of techniques to promoters for the identification of transcription factor binding sites.
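The first approach in the Method section, describing each gene by a vector of motif counts in its 3'UTR, can be illustrated with a short sketch. The motif length, the RNA alphabet, the choice to count overlapping occurrences and the example sequence are all assumptions made for illustration; the poster does not specify them.

```python
from itertools import product


def motif_count_vector(utr, k=3, alphabet="ACGU"):
    """Count every length-k motif in a 3'UTR sequence.

    Returns (motifs, counts): the motifs in a fixed lexicographic order and
    the matching count vector. Overlapping occurrences are counted.
    """
    motifs = ["".join(m) for m in product(alphabet, repeat=k)]
    counts = dict.fromkeys(motifs, 0)
    for start in range(len(utr) - k + 1):
        window = utr[start:start + k]
        if window in counts:  # skips windows containing other symbols
            counts[window] += 1
    return motifs, [counts[m] for m in motifs]


# Hypothetical 3'UTR fragment, used only to show the output shape.
motifs, vector = motif_count_vector("AUGAUGCC", k=3)
aug_count = vector[motifs.index("AUG")]
```

Stacking one such vector per gene gives the feature matrix on which a classifier (for example, tree ensembles or a linear SVM, as in the abstract) can be trained to rank motifs by relevance.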
