References of "Joly, Arnaud"
L1-based compression of random forest models
Joly, Arnaud ULg; Schnitzler, François ULg; Geurts, Pierre ULg et al

in Proceedings of the 21st Belgian-Dutch Conference on Machine Learning (2012, May 24)

Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive, especially in the context of problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that preserving or even improving the model accuracy while significantly reducing its space complexity is indeed possible.

L1-based compression of random forest models
Joly, Arnaud ULg; Schnitzler, François ULg; Geurts, Pierre ULg et al

in 20th European Symposium on Artificial Neural Networks (2012, April)

Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive, especially in the context of problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that preserving or even improving the model accuracy while significantly reducing its space complexity is indeed possible.

Pruning randomized trees with L1-norm regularization
Joly, Arnaud ULg; Schnitzler, François ULg; Geurts, Pierre ULg et al

Poster (2011, November 29)

The growing amount of high-dimensional data requires robust analysis techniques. Tree-based ensemble methods provide accurate supervised learning models for such data. However, the model complexity can become very large depending on the dimension of the dataset. Here we propose a method to compress such ensembles using the space induced by the random trees and L1-norm regularisation. This leads to drastic pruning while preserving or improving the model accuracy. Moreover, our approach increases robustness with respect to the selection of complexity parameters.

Improvement of randomized ensembles of trees for supervised learning in very high dimension
Joly, Arnaud ULg

Master's dissertation (2011)

Tree-based ensemble methods, such as random forests and extremely randomized trees, are methods of choice for handling high-dimensional problems. One important drawback of these methods, however, is the complexity of the models (i.e. the large number and size of trees) they produce to achieve good performance. In this work, several research directions are identified to address this problem. Among those, we have developed the following one. From a tree ensemble, one can extract a set of binary features, each one associated with a leaf or a node of a tree and being true for a given object only if it reaches the corresponding leaf or node when propagated in this tree. Given this representation, the prediction of an ensemble can be retrieved simply by linearly combining these characteristic features with appropriate weights. We apply a linear feature selection method, namely the monotone LASSO, to these features in order to simplify the tree ensemble. A subtree is then pruned as soon as the characteristic features corresponding to its constituent nodes are not selected in the linear model. Empirical experiments show that combining the monotone LASSO with features extracted from tree ensembles both drastically reduces the number of features and can improve the accuracy with respect to unpruned ensembles of trees.
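
The node-indicator representation described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the dissertation's implementation. The two hand-built trees, the toy data, and the names `indicators`, `features` and `lasso_cd` are hypothetical, and a plain LASSO solved by coordinate descent stands in for the monotone LASSO named above.

```python
# Toy ensemble (assumption): a node is a dict; internal nodes carry "feat" and "thr",
# leaves are empty dicts. These trees are hand-built, not learned from data.
tree_a = {"feat": 0, "thr": 0.5, "left": {}, "right": {}}
tree_b = {"feat": 0, "thr": 0.25, "left": {},
          "right": {"feat": 0, "thr": 0.75, "left": {}, "right": {}}}
ensemble = [tree_a, tree_b]

def indicators(node, x, reached=True):
    """One 0/1 feature per node (preorder): 1.0 if x reaches that node."""
    out = [1.0 if reached else 0.0]
    if "feat" in node:
        go_left = x[node["feat"]] <= node["thr"]
        out += indicators(node["left"], x, reached and go_left)
        out += indicators(node["right"], x, reached and not go_left)
    return out

def features(x):
    """Concatenate the node indicators of every tree in the ensemble."""
    return [v for tree in ensemble for v in indicators(tree, x)]

def soft_threshold(a, lam):
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0

def lasso_cd(Z, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - Zw||^2 + lam*||w||_1 by cyclic coordinate descent."""
    n, p = len(Z), len(Z[0])
    w = [0.0] * p
    resid = list(y)  # residual y - Zw, kept up to date after each update
    col_sq = [sum(Z[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = sum(Z[i][j] * (resid[i] + Z[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Z[i][j] * delta
                w[j] = new_wj
    return w

# Toy regression problem (assumption): one input in [0, 1), step-shaped target.
X = [[i / 8] for i in range(8)]
y = [0.0 if x[0] <= 0.5 else 1.0 for x in X]
Z = [features(x) for x in X]

w = lasso_cd(Z, y, lam=0.01)
kept = [j for j, wj in enumerate(w) if abs(wj) > 1e-9]  # nodes that survive
pred = [sum(wj * zj for wj, zj in zip(w, z)) for z in Z]
print("kept node indices:", kept)
```

Nodes whose weight is driven to exactly zero are candidates for removal. In the scheme described above, a subtree would be pruned only once all of its nodes have zero weight.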