References of "Geurts, Pierre"
Peer Reviewed
A Machine Learning Approach for Material Detection in Hyperspectral Images
Marée, Raphaël ULg; Stevens, Benjamin ULg; Geurts, Pierre ULg et al

in Proc. 6th IEEE Workshop on Object Tracking and Classification Beyond and in the Visible Spectrum (OTCBVS-CVPR09) (2009)

In this paper we propose a machine learning approach for the detection of gaseous traces in thermal infrared hyperspectral images. It exploits both spectral and spatial information by extracting subcubes and by using extremely randomized trees with multiple outputs as a classifier. Promising results are shown on a dataset of more than 60 hypercubes.

Peer Reviewed
Raw genotypes vs haplotype blocks for genome wide association studies by random forests
Botta, Vincent ULg; Hansoul, Sarah ULg; Geurts, Pierre ULg et al

in Proc. of MLSB 2008, second workshop on Machine Learning in Systems Biology (2008, September)

We consider two different representations of the input data for genome-wide association studies using random forests, namely raw genotypes described by a few thousand to a few hundred thousand discrete variables, each describing a single nucleotide polymorphism, and haplotype block contents, represented by the combinations of about 10 to 100 adjacent and correlated genotypes. We adapt random forests to exploit haplotype blocks, and compare this with the use of raw genotypes, in terms of predictive power and localization of causal mutations, by using simulated datasets with one or two interacting effects.

Prediction of genetic risk of complex diseases by supervised learning
Botta, Vincent ULg; Geurts, Pierre ULg; Hansoul, Sarah et al

Scientific conference (2008, May)

Peer Reviewed
Proteomics for prediction and characterization of response to infliximab in Crohn's disease: a pilot study.
Meuwis, Marie-Alice ULg; Fillet, Marianne ULg; Lutteri, Laurence ULg et al

in Clinical Biochemistry (2008), 41(12), 960-7

OBJECTIVES: Infliximab is the first anti-TNFalpha accepted by the Food and Drug Administration for use in inflammatory bowel disease treatment. Few clinical, biological and genetic factors tend to predict response in Crohn's disease (CD) patient subcategories, none widely predicting response to infliximab. DESIGN AND METHODS: Twenty CD patients showing clinical response or non-response to infliximab were used for serum proteomic profiling on Surface Enhanced Laser Desorption Ionisation-Time of Flight-Mass Spectrometry (SELDI-TOF-MS), each before and after treatment. Univariate and multivariate data analyses were performed for prediction and characterization of response to infliximab. RESULTS: We obtained a model of classification predicting response to treatment and selected relevant potential biomarkers, among which platelet aggregation factor 4 (PF4). We quantified PF4, sCD40L and IL-6 by ELISA for correlation studies. CONCLUSIONS: This first proteomic pilot study on response to infliximab in CD suggests association between platelet metabolism and response to infliximab and requires validation studies on a larger cohort of patients.

Peer Reviewed
Exploiting tree-based variable importances to selectively identify relevant variables
Huynh-Thu, Vân Anh ULg; Wehenkel, Louis ULg; Geurts, Pierre ULg

in JMLR: Workshop and Conference Proceedings (2008), 4

Peer Reviewed
Exploiting tree-based variable importances to selectively identify relevant variables
Huynh-Thu, Vân Anh; Wehenkel, Louis ULg; Geurts, Pierre ULg

in Proc. of FSDM08, ECML/PKDD Workshop on New challenges for feature selection in data mining and knowledge discovery (2008)

Peer Reviewed
Compositional protein analysis of HDL by SELDI-TOF MS during experimental endotoxemia
Levels, Johannes HM; Marée, Raphaël ULg; Geurts, Pierre ULg et al

Poster (2008)

Peer Reviewed
Estimation of rotor angles of synchronous machines using artificial neural networks and local PMU-based quantities
Del Angel, A.; Geurts, Pierre ULg; Ernst, Damien ULg et al

in Neurocomputing (2007), 70(16-18), 2668-2678

This paper investigates a possibility for estimating rotor angles in the time frame of transient (angle) stability of electric power systems, for use in real-time. The proposed dynamic state estimation technique is based on the use of voltage and current phasors obtained from a phasor measurement unit supposed to be installed on the extra-high voltage side of the substation of a power plant, together with a multilayer perceptron trained off-line from simulations. We demonstrate that an intuitive approach, in which a neural network directly maps the phasor measurement inputs to the generator rotor angle, does not offer satisfactory results. We found that a good way to approach the angle estimation problem is to use two neural networks in order to estimate the sin(delta) and cos(delta) of the angle and recover the latter from these values by simple post-processing. Simulation results on a part of the Mexican interconnected system show that the approach could yield satisfactory accuracy for real-time monitoring and control of transient instability. (c) 2007 Elsevier B.V. All rights reserved.
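
As an illustration of the post-processing step mentioned in this abstract, the sketch below recovers an angle from separately estimated sine and cosine values. The abstract does not say which post-processing is used; the two-argument arctangent here is an assumption, and the function name and sample values are hypothetical.

```python
import math


def recover_angle(sin_est: float, cos_est: float) -> float:
    """Recover an angle (radians, in (-pi, pi]) from estimates of its sine and cosine.

    Assumption: atan2 is used as the post-processing step. It depends only on
    the signs and ratio of the inputs, so the two estimates do not need to
    satisfy sin^2 + cos^2 = 1 exactly.
    """
    return math.atan2(sin_est, cos_est)


# Stand-ins for the outputs of the two networks (hypothetical values with small errors).
true_delta = 2.5
sin_hat = math.sin(true_delta) + 0.01
cos_hat = math.cos(true_delta) - 0.01

delta_hat = recover_angle(sin_hat, cos_hat)
print(f"true angle: {true_delta:.4f} rad, recovered: {delta_hat:.4f} rad")
```

Estimating the sine and cosine rather than the angle itself avoids the discontinuity where the angle wraps around, which is one plausible reason a direct mapping performs poorly.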

Detection of micro-RNA/gene interactions involved in angiogenesis using machine learning techniques
Huynh-Thu, Vân Anh ULg; Hiard, Samuel ULg; Geurts, Pierre ULg et al

Poster (2007, September)

Motivation: Angiogenesis is the process responsible for the growth of new blood vessels from existing ones. It is also associated with the development of cancer, as tumors need to be irrigated by blood vessels for growing. New cancer therapies appear that exploit angiogenesis inhibitors, also called angiostatic agents, to asphyxiate and starve the tumors. Better understanding the regulatory mechanisms that control angiogenesis is thus fundamental. Recently, short non-coding RNA molecules, called micro-RNAs, have been discovered that are involved in post-transcriptional regulation of gene expressions. These molecules bind to RNA messengers following the base pairing rules, preventing them from being translated into proteins and/or tagging them for degradation. The main goal of this work is to use computational approaches to identify micro-RNAs involved in angiogenesis. Method: In order to identify genes involved in angiogenesis, bovine endothelial cells were treated by a known angiogenesis inhibitor [1], prolactin 16K, and their gene expression profile was compared to the profile of untreated cells. The genes were then divided into three classes: up-regulated, down-regulated, and unaffected genes. The 3'UTR regions of these genes were then analysed by machine learning techniques. Different approaches were considered. First, we described each gene by a vector of motif counts in their 3'UTR regions and used machine learning techniques to rank the motifs according to their relevance for separating the genes into the different classes. We considered successively motifs corresponding to the seeds of known micro-RNAs and also all possible motifs of a given length. To rank the motifs, we compared ensembles of decision trees and linear support vector machines. Second, we considered an approach called Segment and Combine that was proposed in [2]. Finally, we also carried out an exhaustive search of all motifs of a given length that satisfy some constraints on specificity and coverage with respect to a given gene category. Results: The ability of the different approaches at identifying relevant motifs was first assessed on genes predicted to be the target of some known miRNAs. In this simple setting, most methods were able to identify the micro-RNA seed. The results obtained on the genes regulated by prolactin 16K are also very encouraging. We were able to identify one micro-RNA already known to play a role in angiogenesis and several motifs are predicted by different approaches as very specific of up- or down-regulation by prolactin 16K. Their relationship with known micro-RNAs is certainly worth exploring. Conclusion: Machine learning approaches are promising techniques for the identification of micro-RNA/gene interactions. Future work will concern the application of the same kind of techniques on promoters for the identification of transcription factor binding sites.
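
The first approach in this abstract describes each gene by a vector of motif counts in its 3'UTR. A minimal sketch of such a representation follows; the sequence, the motif list, and the choice to count overlapping occurrences are all illustrative assumptions, not details taken from the poster.

```python
from typing import List


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of `motif` in `sequence`, including overlapping ones."""
    count = 0
    start = sequence.find(motif)
    while start != -1:
        count += 1
        # Advance by one position so overlapping matches are also found.
        start = sequence.find(motif, start + 1)
    return count


def motif_count_vector(sequence: str, motifs: List[str]) -> List[int]:
    """Describe one sequence by its count of each motif, in the order given."""
    return [count_motif(sequence, motif) for motif in motifs]


# Hypothetical 3'UTR fragment and candidate motifs.
utr = "ACGTACGTAA"
motifs = ["ACG", "GTA", "AAA"]
features = motif_count_vector(utr, motifs)
print(dict(zip(motifs, features)))
```

Vectors of this kind, one per gene, could then be passed to any classifier that reports variable importances in order to rank the motifs.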

Peer Reviewed
Random Subwindows and Randomized Trees for Image Retrieval, Classification, and Annotation
Marée, Raphaël ULg; Dumont, Marie; Geurts, Pierre ULg et al

Poster (2007, July 22)

Peer Reviewed
Machine-learnt versus analytical models of TCP throughput
El Khayat, Ibtissam; Geurts, Pierre ULg; Leduc, Guy ULg

in Computer Networks (2007), 51(10), 2631-2644

We first study the accuracy of two well-known analytical models of the average throughput of long-term TCP flows, namely the so-called SQRT and PFTK models, and show that these models are far from being accurate in general. Our simulations, based on a large set of long-term TCP sessions, show that 70% of their predictions exceed the boundaries of TCP-Friendliness, thus questioning their use in the design of new TCP-Friendly transport protocols. We then investigate the reasons for this inaccuracy, and show that it is largely due to the lack of discrimination between the two packet loss detection methods used by TCP, namely by triple duplicate acknowledgements or by timeout expirations. We then apply various machine learning techniques to infer new models of the average TCP throughput. We show that they are more accurate than the SQRT and PFTK models, even without the above discrimination, and are further improved when we allow the machine-learnt models to distinguish the two loss detection techniques. Although our models are not analytical formulas, they can be plugged in transport protocols to make them TCP-Friendly. Our results also suggest that analytical models of the TCP throughput should certainly benefit from the incorporation of the timeout loss rate. (C) 2006 Elsevier B.V. All rights reserved.
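
For readers unfamiliar with the SQRT model named in this abstract, the sketch below computes its throughput estimate in the commonly cited form, throughput ≈ (MSS / RTT) · sqrt(3 / (2p)). The abstract does not state the formula; the constant sqrt(3/2) (periodic losses, no delayed acknowledgements) and the parameter names are assumptions made for illustration.

```python
import math


def sqrt_model_throughput(mss_bytes: float, rtt_s: float, loss_rate: float) -> float:
    """Estimate steady-state TCP throughput in bytes per second with the SQRT model.

    Assumes the common form (MSS / RTT) * sqrt(3 / (2 * p)), where p is the
    packet loss rate. The model ignores timeouts entirely.
    """
    if rtt_s <= 0:
        raise ValueError("rtt_s must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    return (mss_bytes / rtt_s) * math.sqrt(1.5 / loss_rate)


# Hypothetical flow: 1460-byte segments, 100 ms round-trip time, 1% loss.
estimate = sqrt_model_throughput(mss_bytes=1460, rtt_s=0.1, loss_rate=0.01)
print(f"estimated throughput: {estimate / 1e3:.1f} kB/s")
```

Because this form takes a single loss rate as input, it cannot distinguish losses detected by triple duplicate acknowledgements from those detected by timeout, which is the distinction the abstract identifies as important.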

Peer Reviewed
Inferring biological networks with output kernel trees
Geurts, Pierre ULg; Touleimat, Nizar; Dutreix, Marie et al

in BMC Bioinformatics (2007), 8(Suppl. 2), 4

Background: Elucidating biological networks between proteins appears nowadays as one of the most important challenges in systems biology. Computational approaches to this problem are important to complement high-throughput technologies and to help biologists in designing new experiments. In this work, we focus on the completion of a biological network from various sources of experimental data. Results: We propose a new machine learning approach for the supervised inference of biological networks, which is based on a kernelization of the output space of regression trees. It inherits several features of tree-based algorithms such as interpretability, robustness to irrelevant variables, and input scalability. We applied this method to the inference of two kinds of networks in the yeast S. cerevisiae: a protein-protein interaction network and an enzyme network. In both cases, we obtained results competitive with existing approaches. We also show that our method provides relevant insights on input data regarding their potential relationship with the existence of interactions. Furthermore, we confirm the biological validity of our predictions in the context of an analysis of gene expression data. Conclusion: Output kernel tree based methods provide an efficient tool for the inference of biological networks from experimental data. Their simplicity and interpretability should make them of great value for biologists.

Peer Reviewed
Biomarker discovery for inflammatory bowel disease, using proteomic serum profiling
Meuwis, Marie-Alice ULg; Fillet, Marianne ULg; Geurts, Pierre ULg et al

in Biochemical Pharmacology (2007), 73(9), 1422-1433

Crohn's disease and ulcerative colitis, known as inflammatory bowel diseases (IBD), are chronic immuno-inflammatory pathologies of the gastrointestinal tract. These diseases are multifactorial, polygenic and of unknown etiology. Clinical presentation is non-specific and diagnosis is based on clinical, endoscopic, radiological and histological criteria. Novel markers are needed to improve early diagnosis and classification of these pathologies. We performed a study with 120 serum samples collected from patients classified in 4 groups (30 Crohn, 30 ulcerative colitis, 30 inflammatory controls and 30 healthy controls) according to accredited criteria. We compared protein sera profiles obtained with a Surface Enhanced Laser Desorption Ionization-Time of Flight-Mass Spectrometer (SELDI-TOF-MS). Data analysis with a univariate process and a multivariate statistical method based on multiple decision trees algorithms allowed us to select some potential biomarkers. Four of them were identified by mass spectrometry and antibody based methods. Multivariate analysis generated models that could classify samples with good sensitivity and specificity (minimum 80%) discriminating groups of patients. This analysis was used as a tool to classify peaks according to differences in level on spectra through the four categories of patients. Four biomarkers showing important diagnostic value were purified and identified (PF4, MRP8, FIBA and Hpalpha2), and two of these, PF4 and Hpalpha2, were detected in sera by classical methods. SELDI-TOF-MS technology and use of the multiple decision trees method led to protein biomarker patterns analysis and allowed the selection of potential individual biomarkers. Their downstream identification may prove helpful for IBD classification and etiology understanding.

Peer Reviewed
Random Subwindows and Multiple Output Decision Trees for Generic Image Annotation
Dumont, Marie; Marée, Raphaël ULg; Geurts, Pierre ULg et al

Poster (2007)

Peer Reviewed
Random subwindows and extremely randomized trees for image classification in cell biology
Marée, Raphaël ULg; Geurts, Pierre ULg; Wehenkel, Louis ULg

in BMC Cell Biology (2007), 8(Suppl. 1),

Background: With the improvements in biosensors and high-throughput image acquisition technologies, life science laboratories are able to perform an increasing number of experiments that involve the generation of a large amount of images at different imaging modalities/scales. It stresses the need for computer vision methods that automate image classification tasks. Results: We illustrate the potential of our image classification method in cell biology by evaluating it on four datasets of images related to protein distributions or subcellular localizations, and red-blood cell shapes. Accuracy results are quite good without any specific pre-processing or incorporation of domain knowledge. The method is implemented in Java and available upon request for evaluation and research purposes. Conclusion: Our method is directly applicable to any image classification problem. We foresee the use of this automatic approach as a baseline method and first try on various biological image classification problems.

Peer Reviewed
Content-based Image Retrieval by Indexing Random Subwindows with Randomized Trees
Marée, Raphaël ULg; Geurts, Pierre ULg; Wehenkel, Louis ULg

in Proc. 8th Asian Conference on Computer Vision (ACCV), LNCS (2007)

We propose a new method for content-based image retrieval which exploits the similarity measure and indexing structure of totally randomized tree ensembles induced from a set of subwindows randomly extracted from a sample of images. We also present the possibility of updating the model as new images come in, and the capability of comparing new images using a model previously constructed from a different set of images. The approach is quantitatively evaluated on various types of images with state-of-the-art results despite its conceptual simplicity and computational efficiency.
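
The subwindow extraction step mentioned in this abstract can be sketched as follows. The abstract only says that subwindows are extracted at random; square windows, the size range, the plain nested-list image, and the function name are assumptions made to keep the example self-contained.

```python
import random
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel values


def extract_random_subwindows(
    image: Image, n_windows: int, min_size: int, max_size: int, rng: random.Random
) -> List[Image]:
    """Extract square subwindows of random size at random positions."""
    height, width = len(image), len(image[0])
    largest = min(max_size, height, width)
    if min_size < 1 or min_size > largest:
        raise ValueError("min_size must be between 1 and the largest window that fits")
    windows = []
    for _ in range(n_windows):
        size = rng.randint(min_size, largest)    # inclusive bounds
        top = rng.randint(0, height - size)      # window must fit vertically
        left = rng.randint(0, width - size)      # and horizontally
        windows.append([row[left:left + size] for row in image[top:top + size]])
    return windows


# Hypothetical 8x10 image whose pixel value encodes its own position.
image = [[r * 10 + c for c in range(10)] for r in range(8)]
windows = extract_random_subwindows(image, n_windows=5, min_size=2, max_size=4,
                                    rng=random.Random(0))
print([len(w) for w in windows])
```

In a full pipeline each subwindow would typically be resized to a fixed size before being passed to the tree ensemble; that step is omitted here.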

Peer Reviewed
Gradient boosting for kernelized output spaces
Geurts, Pierre ULg; Wehenkel, Louis ULg; d'Alché-Buc, Florence

in Proceedings of the 24th International Conference on Machine Learning (2007)

Peer Reviewed
On the accuracy of analytical models of TCP throughput
El Khayat, Ibtissam; Geurts, Pierre ULg; Leduc, Guy ULg

in Lecture Notes in Computer Science (2006, May), 3976

Based on a large set of TCP sessions we first study the accuracy of two well-known analytical models (SQRT and PFTK) of the TCP average rate. This study shows that these models are far from being accurate on average. Actually, our simulations show that 70% of their predictions exceed the boundaries of TCP-Friendliness, thus questioning their use in the design of new TCP-Friendly transport protocols. Our study also shows that the inaccuracy of the PFTK model is largely due to its inability to make the distinction between the two packet loss detection methods used by TCP: triple duplicate acknowledgments or timeout expirations. We then use supervised learning techniques to infer models of the TCP rate. These models show important accuracy improvements when they take into account the two types of losses. This suggests that analytical models of TCP throughput should certainly benefit from the incorporation of the timeout loss rate.
