References of "Geurts, Pierre"
Peer Reviewed
Ensembles of extremely randomized trees and some generic applications
Wehenkel, Louis ULg; Ernst, Damien ULg; Geurts, Pierre ULg

in Proceedings of Robust Methods for Power System State Estimation and Load Forecasting (2006)

In this paper we present a new tree-based ensemble method called “Extra-Trees”. This algorithm averages predictions of trees obtained by partitioning the input space with randomly generated splits, leading to significant improvements in precision and to various algorithmic advantages, in particular lower computational complexity and better scalability. We also discuss two generic applications of this algorithm, namely time-series classification and the automatic inference of near-optimal sequential decision policies from experimental data.

Peer Reviewed
About automatic learning for advanced sensing, monitoring and control of electric power systems
Wehenkel, Louis ULg; Glavic, Mevludin; Geurts, Pierre ULg et al

in Proceedings of the Second Carnegie Mellon Conference in Electric Power Systems: Monitoring, Sensing, Software and its Valuation for the Changing Electric Power Industry (2006)

The paper considers the possible uses of automatic learning for improving power system performance by software methodologies. Automatic learning per se is first reviewed and recent developments of the field are highlighted. Then the authors’ views of its main actual or potential applications related to power system operation and control are described, and for each application the present status and the needs for further developments are discussed.

Peer Reviewed
Biological Image Classification with Random Subwindows and Extra-Trees
Marée, Raphaël ULg; Geurts, Pierre ULg; Wehenkel, Louis ULg

Conference (2006)

We illustrate the potential of our image classification method on three datasets of images from different imaging modalities and scales, from subcellular locations up to human body regions. The method is based on random subwindow extraction and the combination of the subwindow classifications using ensembles of extremely randomized decision trees.

Peer Reviewed
OK3: Méthode d’arbres à sortie noyau pour la prédiction de sorties structurées et l’apprentissage de noyau
Geurts, Pierre ULg; Wehenkel, Louis ULg; d'Alché-Buc, Florence

in Proc. of CAP (Conférence francophone d'apprentissage) (2006)

Peer Reviewed
Kernelizing the output of tree-based methods
Geurts, Pierre ULg; Wehenkel, Louis ULg; d'Alché-Buc, Florence

in Proceedings of the 23rd International Conference on Machine Learning (2006)

We extend tree-based methods to the prediction of structured outputs using a kernelization of the algorithm that allows one to grow trees as soon as a kernel can be defined on the output space. The resulting algorithm, called output kernel trees (OK3), generalizes classification and regression trees as well as tree-based ensemble methods in a principled way. It inherits several features of these methods such as interpretability, robustness to irrelevant variables, and input scalability. When only the Gram matrix over the outputs of the learning sample is given, it learns the output kernel as a function of inputs. We show that the proposed algorithm works well on an image reconstruction task and on a biological network inference problem.

Peer Reviewed
Elucidating the structure of genetic regulatory networks: a study of a second order dynamical model on artificial data
Quach, Minh; Geurts, Pierre ULg; d'Alché-Buc, Florence

in Proc. of the 14th European Symposium on Artificial Neural Networks (2006)

Learning regulatory networks from time-series of gene expression is a challenging task. We propose to use synthetic data to analyze the ability of a state-space model to retrieve the network structure while varying a number of relevant problem parameters. ROC curves together with new tools such as spectral clustering of local solutions found by EM are used to analyze these results and provide relevant insights.

Peer Reviewed
Completion of biological networks: the output kernel trees approach
Geurts, Pierre ULg; Touleimat, Nizar; Dutreix, Marie et al

in Proceedings of the Workshop on Probabilistic Modeling and Machine Learning in Structural and Systems Biology (2006)

Peer Reviewed
Segment and combine: a generic approach for supervised learning of invariant classifiers from topologically structured data
Geurts, Pierre ULg; Marée, Raphaël ULg; Wehenkel, Louis ULg

in Proceedings of the Machine Learning Conference of Belgium and The Netherlands (Benelearn) (2006)

A generic method for supervised classification of structured objects is presented. The approach induces a classifier by (i) deriving a surrogate dataset from a pre-classified dataset of structured objects, by segmenting them into pieces, (ii) learning a model relating pieces to object-classes, (iii) classifying structured objects by combining predictions made for their pieces. The segmentation makes it possible to exploit local information and can be adapted to inject invariances into the resulting classifier. The framework is illustrated on practical sequence, time-series and image classification problems.
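
The three steps (i) to (iii) named in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the objects are plain strings, the pieces are all contiguous substrings of a fixed length (random sampling of pieces would be an alternative), and the piece-level model is a simple table of class frequencies standing in for a learned classifier. The names `segment`, `learn_piece_model`, `classify` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def segment(obj, piece_length):
    """Cut an object (here a string) into all contiguous pieces of a fixed length."""
    return [obj[i:i + piece_length] for i in range(len(obj) - piece_length + 1)]


def learn_piece_model(labelled_objects, piece_length):
    """Steps (i) and (ii): build the surrogate dataset of (piece, class) pairs
    and summarise it as class counts per piece (a stand-in for a learned model)."""
    counts = defaultdict(Counter)
    for obj, label in labelled_objects:
        for piece in segment(obj, piece_length):
            counts[piece][label] += 1
    return counts


def classify(obj, piece_model, piece_length):
    """Step (iii): combine the class estimates of the pieces by summing them."""
    votes = Counter()
    for piece in segment(obj, piece_length):
        piece_counts = piece_model.get(piece)
        if piece_counts:  # pieces never seen during learning are skipped
            total = sum(piece_counts.values())
            for label, count in piece_counts.items():
                votes[label] += count / total
    return votes.most_common(1)[0][0] if votes else None


# Toy pre-classified dataset (an assumption made for illustration only).
training_objects = [("ababababab", "A"), ("cdcdcdcdcd", "B")]
piece_model = learn_piece_model(training_objects, piece_length=3)
predicted_first = classify("babababa", piece_model, piece_length=3)
predicted_second = classify("dcdcdcdc", piece_model, piece_length=3)
```

Because the decision is made from local pieces, an object that is shifted or has a different length than the training objects can still be classified, which is one way a segmentation can inject an invariance.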

Peer Reviewed
A Semi-Algebraic Description of Discrete Naive Bayes Models with Two Hidden Classes
Auvray, Vincent ULg; Geurts, Pierre ULg; Wehenkel, Louis ULg

in Proc. Ninth International Symposium on Artificial Intelligence and Mathematics (2006)

Peer Reviewed
Improving TCP in wireless networks with an adaptive machine-learnt classifier of packet loss causes
El Khayat, Ibtissam; Geurts, Pierre ULg; Leduc, Guy ULg

in Lecture Notes in Computer Science (2005, May), 3462

TCP understands all packet losses as buffer overflows and reacts to such congestion by reducing its rate. In hybrid wired/wireless networks where a non-negligible number of packet losses are due to link errors, TCP is unable to sustain a reasonable rate. In this paper, we propose to extend TCP NewReno with a packet loss classifier built by a supervised learning algorithm called 'decision tree boosting'. The learning set of the classifier is a database of 25,000 packet loss events in a thousand random topologies. Since only a limited percentage of congestion losses may be misclassified as link errors if TCP-Friendliness is to be preserved, our protocol computes this constraint dynamically and tunes a parameter of the classifier accordingly to maximise the TCP rate. Our classifier outperforms the Veno and Westwood classifiers by achieving a higher rate in wireless networks while remaining TCP-Friendly.

Peer Reviewed
Tree-based batch mode reinforcement learning
Ernst, Damien ULg; Geurts, Pierre ULg; Wehenkel, Louis ULg

in Journal of Machine Learning Research (2005), 6

Reinforcement learning aims to determine an optimal control policy from interaction with a system or from observations gathered from a system. In batch mode, it can be achieved by approximating the so-called Q-function based on a set of four-tuples (x(t), u(t), r(t), x(t+1)) where x(t) denotes the system state at time t, u(t) the control action taken, r(t) the instantaneous reward obtained and x(t+1) the successor state of the system, and by determining the control policy from this Q-function. The Q-function approximation may be obtained from the limit of a sequence of (batch mode) supervised learning problems. Within this framework we describe the use of several classical tree-based supervised learning methods (CART, Kd-tree, tree bagging) and two newly proposed ensemble algorithms, namely extremely and totally randomized trees. We study their performance on several examples and find that the ensemble methods based on regression trees perform well in extracting relevant information about the optimal control policy from sets of four-tuples. In particular, the totally randomized trees give good results while ensuring the convergence of the sequence, whereas by relaxing the convergence constraint even better accuracy results are provided by the extremely randomized trees.
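
The sequence of batch-mode supervised learning problems described in this abstract can be sketched as below. This is an illustration under stated assumptions, not the paper's code: the regressor is a plain per-key averager standing in for the tree-based methods, the discount factor `gamma` and the five-state chain used as toy data are assumptions, and all function names are hypothetical.

```python
from collections import defaultdict


def fit_averager(training_set):
    """Stand-in supervised learner: averages the targets seen for each
    (state, action) input. A tree-based regressor would take this place."""
    sums, counts = defaultdict(float), defaultdict(int)
    for key, target in training_set:
        sums[key] += target
        counts[key] += 1
    table = {key: sums[key] / counts[key] for key in sums}
    return lambda key: table.get(key, 0.0)


def fitted_q_iteration(four_tuples, actions, gamma, n_iterations):
    """Approximate the Q-function from four-tuples (x, u, r, x_next)."""
    q = lambda key: 0.0  # initial approximation: zero everywhere
    for _ in range(n_iterations):
        # Each iteration is one supervised learning problem whose targets
        # are built from the previous Q-function approximation.
        training_set = [
            ((x, u), r + gamma * max(q((x_next, a)) for a in actions))
            for (x, u, r, x_next) in four_tuples
        ]
        q = fit_averager(training_set)
    return q


def greedy_policy(q, actions):
    """Control policy derived from the Q-function."""
    return lambda x: max(actions, key=lambda a: q((x, a)))


# Toy system (assumed): states 0..4 on a line, actions move left or right,
# and a reward of 1 is obtained whenever the successor state is 4.
ACTIONS = (-1, +1)
four_tuples = []
for x in range(5):
    for u in ACTIONS:
        x_next = min(max(x + u, 0), 4)
        four_tuples.append((x, u, 1.0 if x_next == 4 else 0.0, x_next))

q_function = fitted_q_iteration(four_tuples, ACTIONS, gamma=0.9, n_iterations=100)
policy = greedy_policy(q_function, ACTIONS)
```

On this toy chain the greedy policy moves right from every state. Swapping `fit_averager` for a regressor that generalises across states is what makes the scheme usable when the state space is continuous.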

Peer Reviewed
Approximate value iteration in the reinforcement learning context. Application to electrical power system control
Ernst, Damien ULg; Glavic, Mevludin; Geurts, Pierre ULg et al

in International Journal of Emerging Electrical Power Systems (2005), 3(1)

In this paper we explain how to design intelligent agents able to process the information acquired from interaction with a system to learn a good control policy and show how the methodology can be applied to control some devices designed to damp electrical power oscillations. The control problem is formalized as a discrete-time optimal control problem and the information acquired from interaction with the system is a set of samples, where each sample is composed of four elements: a state, the action taken while being in this state, the instantaneous reward observed and the successor state of the system. To process this information we consider reinforcement learning algorithms that determine an approximation of the so-called Q-function by mimicking the behavior of the value iteration algorithm. Simulations are first carried out on a benchmark power system modeled with two state variables. Then we present a more complex case study on a four-machine power system where the reinforcement learning algorithm controls a Thyristor Controlled Series Capacitor (TCSC) designed to damp power system oscillations.

Peer Reviewed
Random Subwindows for Robust Image Classification
Marée, Raphaël ULg; Geurts, Pierre ULg; Piater, Justus ULg et al

in Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR 2005) (2005)

We present a novel, generic image classification method based on a recent machine learning algorithm (ensembles of extremely randomized decision trees). Images are classified using randomly extracted subwindows that are suitably normalized to yield robustness to certain image transformations. Our method is evaluated on four very different, publicly available datasets (COIL-100, ZuBuD, ETH-80, WANG). Our results show that our automatic approach is generic and robust to illumination, scale, and viewpoint changes. An extension of the method is proposed to improve its robustness with respect to rotation changes.

Peer Reviewed
Decision Trees and Random Subwindows for Object Recognition
Marée, Raphaël ULg; Geurts, Pierre ULg; Piater, Justus ULg et al

in ICML workshop on Machine Learning Techniques for Processing Multimedia Content (MLMM2005) (2005)

In this paper, we compare five tree-based machine learning methods within a recent generic image classification framework based on random extraction and classification of subwindows. We evaluate them on three publicly available object recognition datasets (COIL-100, ETH-80, and ZuBuD). Our comparison shows that this general and conceptually simple framework yields good results when combined with ensembles of decision trees, especially when using Tree Boosting or Extra-Trees. The latter is also particularly attractive in terms of computational efficiency.

Peer Reviewed
Segment and combine approach for Biological Sequence Classification
Geurts, Pierre ULg; Blanco Cuesta, Antia; Wehenkel, Louis ULg

in Proc. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2005) (2005)

This paper presents a new algorithm based on the segment and combine paradigm, for automatic classification of biological sequences. It classifies sequences by aggregating the information about their subsequences predicted by a classifier derived by machine learning from a random sample of training subsequences. This generic approach is combined with decision tree based ensemble methods, scalable both with respect to sample size and vocabulary size. The method is applied to three families of problems: DNA sequence recognition, splice junction detection, and gene regulon prediction. With respect to standard approaches based on n-grams, it appears competitive in terms of accuracy, flexibility, and scalability. The paper also highlights the possibility of exploiting the resulting models to identify interpretable patterns specific to a given class of biological sequences.

Bias vs. variance decomposition for regression and classification
Geurts, Pierre ULg

in Maimon, O.; Rokach, L. (Eds.) Data Mining and Knowledge Discovery Handbook: A Complete Guide for Practitioners and Researchers (2005)

In this chapter, the important concepts of bias and variance are introduced. After an intuitive introduction to the bias/variance tradeoff, we discuss the bias/variance decompositions of the mean square error (in the context of regression problems) and of the mean misclassification error (in the context of classification problems). Then, we carry out a small empirical study providing some insight about how the parameters of a learning algorithm influence bias and variance.
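
For the regression case, the decomposition mentioned in this abstract can be checked numerically at a single input point. The sketch below is an illustration and not taken from the chapter: the target function, the noise level, the deliberately crude learner (which predicts the mean training output everywhere) and all names are assumptions. It trains the learner on many independent learning samples and splits its squared error against the noise-free target into squared bias and variance; adding the noise variance gives the expected squared error against noisy outputs.

```python
import random
import statistics


def true_function(x):
    """Noise-free target (an arbitrary choice for the illustration)."""
    return x * x


def fit_constant_model(learning_sample):
    """Deliberately crude learner: predicts the mean training output everywhere."""
    mean_output = statistics.fmean(y for _, y in learning_sample)
    return lambda x: mean_output


rng = random.Random(0)
noise_std = 0.1  # standard deviation of the output noise (assumed)
x0 = 0.9         # input point at which the error is decomposed

# Train one model per independent learning sample and record its prediction at x0.
predictions = []
for _ in range(200):
    inputs = [rng.uniform(0.0, 1.0) for _ in range(20)]
    sample = [(x, true_function(x) + rng.gauss(0.0, noise_std)) for x in inputs]
    predictions.append(fit_constant_model(sample)(x0))

mean_prediction = statistics.fmean(predictions)
bias_squared = (mean_prediction - true_function(x0)) ** 2
variance = statistics.pvariance(predictions)
noise = noise_std ** 2

# Average squared error against the noise-free target: bias^2 + variance.
error_vs_truth = statistics.fmean((p - true_function(x0)) ** 2 for p in predictions)
# Expected squared error against a noisy output: noise + bias^2 + variance.
expected_squared_error = noise + bias_squared + variance
```

With such a rigid learner the squared bias dominates the variance. A more flexible learner would typically shift the balance the other way, which is the tradeoff the chapter discusses.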

Peer Reviewed
Proteomic mass spectra classification using decision tree based ensemble methods.
Geurts, Pierre ULg; Fillet, Marianne ULg; De Seny, Dominique ULg et al

in Bioinformatics (2005), 21(14), 3138-45

MOTIVATION: Modern mass spectrometry allows the determination of proteomic fingerprints of body fluids like serum, saliva or urine. These measurements can be used in many medical applications in order to diagnose the current state or predict the evolution of a disease. Recent developments in machine learning allow one to exploit such datasets, characterized by small numbers of very high-dimensional samples. RESULTS: We propose a systematic approach based on decision tree ensemble methods, which is used to automatically determine proteomic biomarkers and predictive models. The approach is validated on two datasets of surface-enhanced laser desorption/ionization time of flight measurements, for the diagnosis of rheumatoid arthritis and inflammatory bowel diseases. The results suggest that the methodology can handle a broad class of similar problems.

Peer Reviewed
Segment and combine approach for non-parametric time-series classification
Geurts, Pierre ULg; Wehenkel, Louis ULg

in Lecture Notes in Computer Science (2005), 3721

This paper presents a novel, generic, scalable, autonomous, and flexible supervised learning algorithm for the classification of multivariate and variable length time series. The essential ingredients of the algorithm are randomization, segmentation of time-series, decision tree ensemble based learning of subseries classifiers, combination of subseries classification by voting, and cross-validation based temporal resolution adaptation. Experiments are carried out with this method on 10 synthetic and real-world datasets. They highlight the good behavior of the algorithm on a large diversity of problems. Our results are also highly competitive with existing approaches from the literature.

Peer Reviewed
Biomedical image classification with random subwindows and decision trees
Marée, Raphaël ULg; Geurts, Pierre ULg; Piater, Justus ULg et al

in Computer Vision for Biomedical Image Applications (2005)

In this paper, we address a problem of biomedical image classification that involves the automatic classification of x-ray images into 57 predefined classes with large intra-class variability. To achieve that goal, we apply and slightly adapt a recent generic method for image classification based on ensembles of decision trees and random subwindows. We obtain classification results close to the state of the art on a publicly available database of 10,000 x-ray images. We also provide some clues to interpret the classification of each image in terms of subwindow relevance.

Peer Reviewed
Discovery of new rheumatoid arthritis biomarkers using the surface-enhanced laser desorption/ionization time-of-flight mass spectrometry ProteinChip approach.
De Seny, Dominique ULg; Fillet, Marianne ULg; Meuwis, Marie-Alice ULg et al

in Arthritis and Rheumatism (2005), 52(12), 3801-12

OBJECTIVE: To identify serum protein biomarkers specific for rheumatoid arthritis (RA), using surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) technology. METHODS: A total of 103 serum samples from patients and healthy controls were analyzed. Thirty-four of the patients had a diagnosis of RA, based on the American College of Rheumatology criteria. The inflammation control group comprised 20 patients with psoriatic arthritis (PsA), 9 with asthma, and 10 with Crohn's disease. The noninflammation control group comprised 14 patients with knee osteoarthritis and 16 healthy control subjects. Serum protein profiles were obtained by SELDI-TOF-MS and compared in order to identify new biomarkers specific for RA. Data were analyzed by a machine learning algorithm called decision tree boosting, according to different preprocessing steps. RESULTS: The most discriminative mass/charge (m/z) values serving as potential biomarkers for RA were identified on arrays for both patients with RA versus controls and patients with RA versus patients with PsA. From among several candidates, the following peaks were highlighted: m/z values of 2,924 (RA versus controls on H4 arrays), 10,832 and 11,632 (RA versus controls on CM10 arrays), 4,824 (RA versus PsA on H4 arrays), and 4,666 (RA versus PsA on CM10 arrays). Positive results of proteomic analysis were associated with positive results of the anti-cyclic citrullinated peptide test. Our observations suggested that the 10,832 peak could represent myeloid-related protein 8. CONCLUSION: SELDI-TOF-MS technology allows rapid analysis of many serum samples, and use of decision tree boosting analysis as the main statistical method allowed us to propose a pattern of protein peaks specific for RA.
