Mixtures of Bagged Markov Tree Ensembles
Schnitzler, François; Geurts, Pierre; Wehenkel, Louis. In Cano Utrera, Andrés; Gómez-Olmedo, Manuel; Nielsen, Thomas (Eds.), Proceedings of the 6th European Workshop on Probabilistic Graphical Models (2012, September).
Markov trees, a probabilistic graphical model for density estimation, can be expanded in the form of a weighted average of Markov trees. Learning these mixtures or ensembles from observations can be performed to reduce the bias or the variance of the estimated model. We propose a new combination of both, where the upper level seeks to reduce bias while the lower level seeks to reduce variance. This algorithm is evaluated empirically on data sets generated from a mixture of Markov trees and from other synthetic densities.

L1-based compression of random forest models
Joly, Arnaud; Schnitzler, François; Geurts, Pierre et al. In Proceedings of the 21st Belgian-Dutch Conference on Machine Learning (2012, May 24).
Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive, especially in the context of problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that preserving or even improving the model accuracy while significantly reducing its space complexity is indeed possible.

Approximation efficace de mélanges bootstrap d’arbres de Markov pour l’estimation de densité
Schnitzler, François et al. In Bougrain, Laurent (Ed.), Actes de la 14e Conférence Francophone sur l'Apprentissage Automatique (CAp 2012) (2012, May 23).
We consider algorithms for learning bootstrap mixtures of Markov trees for density estimation. For problems with many variables and few observations, these mixtures generally estimate the density better than a single tree learned by maximum likelihood, but they are more expensive to learn. We therefore study an algorithm that learns these models approximately, so as to speed up learning without sacrificing accuracy. More specifically, while computing a first Markov tree we collect the edges that are good candidates for the structure, and consider only those when learning the subsequent trees. We compare this algorithm to the original mixture algorithm, to a single tree learned by maximum likelihood, to a regularized tree, and to another approximate method.

L1-based compression of random forest models
Joly, Arnaud; Schnitzler, François; Geurts, Pierre et al. In 20th European Symposium on Artificial Neural Networks (2012, April).
Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive, especially in the context of problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that preserving or even improving the model accuracy while significantly reducing its space complexity is indeed possible.

Efficiently Approximating Markov Tree Bagging for High-Dimensional Density Estimation
Schnitzler, François. Scientific conference (2011, December 07).

Pruning randomized trees with L1-norm regularization
Joly, Arnaud; Schnitzler, François; Geurts, Pierre et al. Poster (2011, November 29).
The growing amount of high-dimensional data requires robust analysis techniques. Tree-based ensemble methods provide such accurate supervised learning models. However, the model complexity can become extremely large depending on the dimension of the dataset. Here we propose a method to compress such an ensemble using the space induced by the random trees and L1-norm regularization. This leads to drastic pruning while preserving or improving the model accuracy. Moreover, our approach increases robustness with respect to the selection of complexity parameters.

Efficiently approximating Markov tree bagging for high-dimensional density estimation
Schnitzler, François et al. In Gunopulos, Dimitrios; Hofmann, Thomas; Malerba, Donato et al. (Eds.), Machine Learning and Knowledge Discovery in Databases, Part III (2011, September).
We consider algorithms for generating Mixtures of Bagged Markov Trees for density estimation. In problems defined over many variables and when few observations are available, those mixtures generally outperform a single Markov tree maximizing the data likelihood, but are far more expensive to compute. In this paper, we describe new algorithms for approximating such models, with the aim of speeding up learning without sacrificing accuracy. More specifically, we propose to use a filtering step obtained as a by-product of computing a first Markov tree, so as to avoid considering poor candidate edges in the subsequently generated trees. We compare these algorithms (on synthetic data sets) to Mixtures of Bagged Markov Trees, as well as to a single Markov tree derived by the classical Chow-Liu algorithm and to a recently proposed randomized scheme for building tree mixtures.

Two-level Mixtures of Markov Trees
Schnitzler, François; Wehenkel, Louis. Poster (2011, June 29).
We study algorithms for learning Mixtures of Markov Trees for density estimation. There are two approaches to building such mixtures, both of which exploit the interesting scaling properties of Markov trees. We investigate whether the maximum likelihood and the variance reduction approaches can be combined by building a two-level Mixture of Markov Trees.
Our experiments on synthetic data sets show that this two-level model outperforms the maximum likelihood one.

Looking for applications of mixtures of Markov trees in bioinformatics
Schnitzler, François; Geurts, Pierre; Wehenkel, Louis. Scientific conference (2011, March 21).
Probabilistic graphical models (PGM) efficiently encode a probability distribution over a large set of variables. While they have already had several successful applications in biology, their poor scaling in terms of the number of variables may make them unfit to tackle problems of increasing size. Mixtures of trees, however, scale well by design. Experiments on synthetic data have shown the interest of our new learning methods for this model, and we now wish to apply them to relevant problems in bioinformatics.

Outcome of pregnancy in women with inflammatory bowel disease treated with antitumor necrosis factor therapy
Schnitzler, François; Boukerroucha, Meriem et al. In Inflammatory Bowel Diseases (2011), 17(9), 1846-1854.
BACKGROUND: Infliximab (IFX) and adalimumab (ADA) are attractive treatment options in patients with inflammatory bowel disease (IBD), including during pregnancy, but there are still limited data on the benefit/risk profile of IFX and ADA during pregnancy. METHODS: This observational study assessed pregnancy outcomes in 212 women with IBD under anti-tumor necrosis factor alpha (TNF) treatment at our IBD unit.
Pregnancy outcomes in 42 pregnancies with direct exposure to anti-TNF treatment (35 IFX, 7 ADA) were compared with those in 23 pregnancies prior to IBD diagnosis, 78 pregnancies before the start of IFX, 53 pregnancies with indirect exposure to IFX, and 56 matched pregnancies in healthy women. RESULTS: Thirty-two of the 42 pregnancies ended in live births, with a median gestational age of 38 weeks (interquartile range [IQR] 37-39). There were seven premature deliveries, six children had low birth weight, and there was one stillbirth. One boy, delivered at week 33 weighing 1640 g, died at the age of 13 days because of necrotizing enterocolitis. A total of eight abortions (one at the patient's wish) occurred in seven women. Trisomy 18 was diagnosed in one fetus of a 37-year-old mother with CD under ADA treatment (40 mg weekly), and the pregnancy was terminated. Pregnancy outcomes after direct exposure to anti-TNF treatment were not different from those in pregnancies before anti-TNF treatment or with indirect exposure to anti-TNF treatment, but outcomes were worse than in pregnancies before IBD diagnosis. CONCLUSIONS: Direct exposure to anti-TNF treatment during pregnancy was not related to a higher incidence of adverse pregnancy outcomes than IBD overall.

Optimal sample selection for batch-mode reinforcement learning
Rachelson, Emmanuel; Schnitzler, François; Wehenkel, Louis et al. In Proceedings of the 3rd International Conference on Agents and Artificial Intelligence (ICAART 2011) (2011).
We introduce the Optimal Sample Selection (OSS) meta-algorithm for solving discrete-time Optimal Control problems.
This meta-algorithm maps the problem of finding a near-optimal closed-loop policy to the identification of a small set of one-step system transitions, leading to high-quality policies when used as input to a batch-mode Reinforcement Learning (RL) algorithm. We detail a particular instance of this OSS meta-algorithm that uses tree-based Fitted Q-Iteration as the batch-mode RL algorithm and Cross-Entropy search as a method for navigating efficiently in the space of sample sets. The results show that this particular instance of OSS algorithms is able to rapidly identify small sample sets leading to high-quality policies.

Discussing the validation of high-dimensional probability distribution learning with mixtures of graphical models for inference
Schnitzler, François. Poster (2010, October 06).
Exact inference on probabilistic graphical models quickly becomes intractable when the dimension of the problem increases. A weighted average (or mixture) of different simple graphical models can be used instead of a more complicated model to learn a distribution, allowing probabilistic inference to be much more efficient. I hope to discuss issues related to the validation of algorithms for learning such mixtures of models and to high-dimensional learning of probabilistic graphical models in general, and to gather valuable feedback and comments on my approach. The main problems are the difficulty of assessing the accuracy of the algorithms and of choosing a representative set of target distributions.

The accuracy of algorithms for learning probabilistic graphical models is often evaluated by comparing the structure of the resulting model to the target (e.g., the number of similar/dissimilar edges, the BDe score, etc.). This approach however falls short when studying methods using a mixture of simple models: individually, these lack the representational power to model the true distribution, and only their combination allows them to compete with more sophisticated models. The Kullback-Leibler divergence is a measure of the difference between two probability densities, and can be used to compare any model learned from a dataset to the data-generating distribution. For computational reasons, I however had to resort to a Monte Carlo estimation of this quantity for large problems (starting at around 200 variables). Since probabilistic inference, and not probability modelling, is the ultimate motivation for building these models, a more meaningful measure of accuracy could be obtained by comparing mixtures against a combination of state-of-the-art model learning and approximate inference algorithms. However, the exact inference result cannot be easily assessed for interesting target distributions, since mixtures are considered precisely because exact inference is not possible on those targets, and approximate inference would introduce a bias.

Selecting a target distribution used to generate the data sets on which the algorithms are evaluated also proved a challenge. The easiest solution was to generate them at random (although different approaches can be designed). These models are however likely to be rather different from real problems, and thus constitute a poor choice for assessing the practical interest of mixtures of models. Methods (e.g., linking multiple copies of a given network) have been developed to increase the size of models known by the community (e.g., the ALARM network), and the resulting graphical models have been made available. These could however still be far from the kind of interactions present in a real setting. A better way to proceed could be to generate samples based on the equations describing a physical problem, to learn a probabilistic model as well as possible from this high-dimensional dataset, and to use it as the target distribution.

Sub-quadratic Markov tree mixture learning based on randomizations of the Chow-Liu algorithm
Schnitzler, François et al. In Myllymäki, Petri; Roos, Teemu; Jaakkola, Tommi (Eds.), Proceedings of the Fifth European Workshop on Probabilistic Graphical Models (PGM-2010) (2010, September).
The present work analyzes different randomized methods to learn Markov tree mixtures for density estimation in very high-dimensional discrete spaces (a very large number n of discrete variables) when the sample size (N) is very small compared to n. Several sub-quadratic relaxations of the Chow-Liu algorithm are proposed, weakening its search procedure. We first study naïve randomizations and then gradually increase the deterministic behavior of the algorithms by trying to focus on the most interesting edges, either by retaining the best edges between models, or by inferring promising relationships between variables. We compare these methods to totally random tree generation and to randomization based on bootstrap resampling (bagging), respectively of linear and of quadratic complexity. Our results show that randomization becomes increasingly more interesting for smaller N/n ratios, and that methods based on simultaneously discovering and exploiting the problem structure are promising in this context.
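Several entries above build on the Chow-Liu algorithm (pairwise mutual informations followed by a maximum-weight spanning tree) and on bagging it over bootstrap resamples. A minimal pure-Python sketch of that pipeline; the function names and the Kruskal-based spanning-tree step are my illustration, not the papers' code:

```python
import math
import random
from collections import Counter
from itertools import combinations

def mutual_information(data, i, j):
    """Empirical mutual information between discrete variables i and j."""
    n = len(data)
    ci, cj, cij = Counter(), Counter(), Counter()
    for row in data:
        ci[row[i]] += 1
        cj[row[j]] += 1
        cij[row[i], row[j]] += 1
    mi = 0.0
    for (a, b), nab in cij.items():
        # p(a,b) * log( p(a,b) / (p(a) p(b)) ), written in counts
        mi += (nab / n) * math.log(nab * n / (ci[a] * cj[b]))
    return mi

def chow_liu_tree(data, n_vars):
    """Structure of the maximum-likelihood Markov tree: maximum-weight
    spanning tree over pairwise mutual informations (Kruskal's algorithm)."""
    scored = sorted(((mutual_information(data, i, j), i, j)
                     for i, j in combinations(range(n_vars), 2)), reverse=True)
    parent = list(range(n_vars))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    edges = []
    for _, i, j in scored:
        ri, rj = find(i), find(j)
        if ri != rj:  # keep the edge only if it joins two components
            parent[ri] = rj
            edges.append((i, j))
    return edges

def bagged_markov_trees(data, n_vars, n_trees, seed=0):
    """One Chow-Liu tree per bootstrap resample; a mixture averages them."""
    rng = random.Random(seed)
    return [chow_liu_tree([rng.choice(data) for _ in data], n_vars)
            for _ in range(n_trees)]
```

Edges tied at (near-)zero mutual information are broken arbitrarily, which is one source of the variance that the bagged mixtures studied above average out.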
Vers un apprentissage subquadratique pour les mélanges d’arbres
Schnitzler, François; Wehenkel, Louis. Conference (2010, May 10).
We consider randomization schemes of the Chow-Liu algorithm, from weak (bagging, of quadratic complexity) to strong ones (full random sampling, of linear complexity), for learning probability density models in the form of mixtures of Markov trees. Our empirical study on high-dimensional synthetic problems shows that, while bagging is the most accurate scheme on average, some of the stronger randomizations remain very competitive in terms of accuracy, especially for small sample sizes.

Towards sub-quadratic learning of probability density models in the form of mixtures of trees
Schnitzler, François; Wehenkel, Louis et al. (2010, April).
We consider randomization schemes of the Chow-Liu algorithm, from weak (bagging, of quadratic complexity) to strong ones (full random sampling, of linear complexity), for learning probability density models in the form of mixtures of Markov trees. Our empirical study on high-dimensional synthetic problems shows that, while bagging is the most accurate scheme on average, some of the stronger randomizations remain very competitive in terms of accuracy, especially for small sample sizes.

Constraint Based Learning of Mixtures of Trees
Schnitzler, François; Wehenkel, Louis. Conference (2009).
Mixtures of trees can be used to model any multivariate distribution. In this work, the possibility of learning these models from data through causal learning is explored. The algorithm developed aims at approximating all first-order relationships between pairs of variables by a mixture of a given size. This approach is evaluated on synthetic data and seems promising.
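The L1-compression entries above regularize the set of indicator functions defined by the nodes of a tree ensemble, so that nodes whose weight is driven to zero can be pruned. A hypothetical sketch of such a refit by cyclic coordinate descent; the Lasso solver and the toy indicator matrix are mine, not the papers' implementation:

```python
def soft_threshold(x, t):
    """Soft-thresholding operator used by the Lasso update."""
    if x > t:
        return x - t
    if x < -t:
        return x + t
    return 0.0

def lasso_cd(Z, y, lam, n_iter=200):
    """Cyclic coordinate descent for min_w 1/(2n)||y - Zw||^2 + lam*||w||_1.

    Z[k][f] plays the role of 'sample k reaches node f' of the ensemble;
    a zero weight w[f] means node f can be pruned.
    """
    n, p = len(Z), len(Z[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Zw, with w = 0 initially
    for _ in range(n_iter):
        for f in range(p):
            col = [Z[k][f] for k in range(n)]
            norm = sum(z * z for z in col) / n
            if norm == 0.0:
                continue  # empty indicator: keep its weight at zero
            rho = sum(col[k] * (resid[k] + col[k] * w[f]) for k in range(n)) / n
            w_new = soft_threshold(rho, lam) / norm
            if w_new != w[f]:
                delta = w_new - w[f]
                for k in range(n):
                    resid[k] -= col[k] * delta
                w[f] = w_new
    return w

# Toy node-indicator matrix: feature 0 reproduces the targets, 1 and 2 are noise.
Z = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [0, 0, 0]]
y = [1.0, 1.0, 0.0, 0.0]
weights = lasso_cd(Z, y, lam=0.01)
kept = [f for f, wf in enumerate(weights) if wf != 0.0]  # nodes to keep
```

On the toy data the penalty sets the two uninformative indicators to exactly zero weight, which would correspond to keeping only the matching nodes of the forest.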
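The validation discussion above resorts to a Monte Carlo estimate of the Kullback-Leibler divergence when exact computation over all configurations is intractable. A small illustration of that estimator on a pair of Bernoulli distributions (the function names and the example are mine):

```python
import math
import random

def kl_monte_carlo(sample_p, log_p, log_q, n_samples, seed=0):
    """Estimate KL(p||q) = E_p[log p(X) - log q(X)] from samples of p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = sample_p(rng)
        total += log_p(x) - log_q(x)
    return total / n_samples

def make_bernoulli(theta):
    """Sampler and log-density of a Bernoulli(theta) distribution."""
    def sample(rng):
        return 1 if rng.random() < theta else 0
    def log_prob(x):
        return math.log(theta if x == 1 else 1.0 - theta)
    return sample, log_prob

# p is the data-generating distribution, q a (mis)learned model of it.
sample_p, log_p = make_bernoulli(0.5)
_, log_q = make_bernoulli(0.25)
estimate = kl_monte_carlo(sample_p, log_p, log_q, 20000)
exact = 0.5 * math.log(0.5 / 0.25) + 0.5 * math.log(0.5 / 0.75)
```

In the high-dimensional setting discussed above, `log_p` and `log_q` would be the log-densities of the target distribution and of the learned mixture, each computable point-wise even when summing over all configurations is not.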