References of "Joly, Arnaud"
Peer Reviewed
Simple connectome inference from partial correlation statistics in calcium imaging
Sutera, Antonio ULg; Joly, Arnaud ULg; François-Lavet, Vincent ULg et al

in Machine Learning and Knowledge Discovery in Databases (2014, June)


In this work, we propose a simple yet effective solution to the problem of connectome inference in calcium imaging data. The proposed algorithm consists of two steps: first, processing the raw signals to detect neural peak activities; second, inferring the degree of association between neurons from partial correlation statistics. This paper summarises the methodology that led us to win the Connectomics Challenge, proposes a simplified version of our method, and compares our results with those of other inference methods.
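The second step, scoring associations between neurons from partial correlation statistics, can be sketched via the precision (inverse covariance) matrix. This is an illustrative reconstruction, not the authors' exact pipeline; `partial_correlations` is a hypothetical helper name, and the random signals stand in for the detected peak activities:

```python
import numpy as np

def partial_correlations(signals):
    """Partial correlations between the columns of `signals`
    (shape: n_samples x n_neurons), via the precision matrix."""
    precision = np.linalg.inv(np.cov(signals, rowvar=False))
    d = np.sqrt(np.diag(precision))
    # partial_corr(i, j) = -precision[i, j] / sqrt(precision[i, i] * precision[j, j])
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

rng = np.random.default_rng(0)
signals = rng.normal(size=(500, 4))  # stand-in for detected peak activities
pc = partial_correlations(signals)
```

The resulting matrix entries then serve as association scores between neuron pairs.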

Scikit-Learn: Machine Learning in the Python ecosystem
Joly, Arnaud ULg; Louppe, Gilles ULg

Poster (2014, January 27)


The scikit-learn project is an increasingly popular machine learning library written in Python. It is designed to be simple and efficient, useful to both experts and non-experts, and reusable in a variety of contexts. The primary aim of the project is to provide a compendium of efficient implementations of classic, well-established machine learning algorithms. Among other things, it includes classical supervised and unsupervised learning algorithms, tools for model evaluation and selection, as well as tools for data preprocessing and feature engineering. This presentation will illustrate the use of scikit-learn as a component of the larger scientific Python environment to solve complex data analysis tasks. Examples will include end-to-end workflows based on powerful and popular algorithms in the library. Among others, we will show how to use out-of-core learning with on-the-fly feature extraction to tackle very large natural language processing tasks, how to exploit an IPython cluster for distributed cross-validation, or how to build and use random forests to explore biological data.
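An end-to-end workflow of the kind described above fits in a few lines; a minimal sketch using the library's bundled iris data, a random forest, and the model-evaluation tools (the particular dataset and parameter values are illustrative choices):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Load a small built-in dataset and evaluate a random forest
# with 5-fold cross-validation.
X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
```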

Peer Reviewed
API design for machine learning software: experiences from the scikit-learn project
Buitinck, Lars; Louppe, Gilles ULg; Blondel, Mathieu et al

Conference (2013, September 23)


scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.

L1-based compression of random forest models
Joly, Arnaud ULg; Schnitzler, François ULg; Geurts, Pierre ULg et al

in Proceedings of the 21st Belgian-Dutch Conference on Machine Learning (2012, May 24)


Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive, especially in the context of problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that preserving or even improving the model accuracy while significantly reducing its space complexity is indeed possible.
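The compression idea can be sketched with scikit-learn primitives: extract the binary node-indicator features of a fitted forest, then fit an L1-regularized linear model on them so that unselected subtrees can be pruned. This is an illustrative approximation of the method, not the paper's exact procedure; the dataset, forest size, and `alpha` value are arbitrary assumptions:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=20, random_state=0)
forest = RandomForestRegressor(n_estimators=10, random_state=0).fit(X, y)

# One binary indicator feature per node of every tree: true when
# a sample traverses that node on its way to a leaf.
indicators, _ = forest.decision_path(X)
Z = indicators.toarray().astype(float)

# L1 regularization selects a sparse subset of node indicators;
# subtrees whose nodes all receive zero weight can be pruned away.
lasso = Lasso(alpha=0.1, max_iter=10000).fit(Z, y)
kept_nodes = np.flatnonzero(lasso.coef_)
```

The number of selected nodes (`kept_nodes`) is typically far smaller than the total node count, which is the space saving the abstract refers to.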

Peer Reviewed
L1-based compression of random forest models
Joly, Arnaud ULg; Schnitzler, François ULg; Geurts, Pierre ULg et al

in 20th European Symposium on Artificial Neural Networks (2012, April)


Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive, especially in the context of problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that preserving or even improving the model accuracy while significantly reducing its space complexity is indeed possible.

Pruning randomized trees with L1-norm regularization
Joly, Arnaud ULg; Schnitzler, François ULg; Geurts, Pierre ULg et al

Poster (2011, November 29)


Growing amounts of high-dimensional data require robust analysis techniques. Tree-based ensemble methods provide such accurate supervised learning models. However, the model complexity can become extremely large depending on the dimensionality of the dataset. Here we propose a method to compress such ensembles using the space induced by the randomized trees and L1-norm regularisation. This leads to a drastic pruning, preserving or improving the model accuracy. Moreover, our approach increases robustness with respect to the selection of complexity parameters.

Improvement of randomized ensembles of trees for supervised learning in very high dimension
Joly, Arnaud ULg

Master's dissertation (2011)


Tree-based ensemble methods, such as random forests and extremely randomized trees, are methods of choice for handling high-dimensional problems. One important drawback of these methods, however, is the complexity of the models (i.e. the large number and size of trees) they produce to achieve good performance. In this work, several research directions are identified to address this problem. Among those, we have developed the following one. From a tree ensemble, one can extract a set of binary features, each one associated with a leaf or a node of a tree and being true for a given object only if it reaches the corresponding leaf or node when propagated in this tree. Given this representation, the prediction of an ensemble can simply be retrieved by linearly combining these characteristic features with appropriate weights. We apply a linear feature selection method, namely the monotone LASSO, on these features, in order to simplify the tree ensemble. A subtree will then be pruned as soon as the characteristic features corresponding to its constituent nodes are not selected in the linear model. Empirical experiments show that the combination of the monotone LASSO with features extracted from tree ensembles simultaneously leads to a drastic reduction in the number of features and can improve accuracy with respect to unpruned ensembles of trees.
