References of "Joly, Arnaud"
Peer Reviewed
Sepsis prediction in critically ill patients by platelet activation markers on ICU admission: a prospective pilot study
LAYIOS, Nathalie ULiege; Delierneux, Céline ULiege; Hego, Alexandre ULiege et al

in Intensive Care Medicine Experimental (2017), 5(1), 32

Background: Platelets are involved in both surveillance and host defense against severe infection. To date, whether platelet phenotype or other hemostasis components could be associated with predisposition to sepsis in critical illness remains unknown. The aim of this work was to identify platelet markers that could predict sepsis occurrence in critically ill injured patients.
Results: This single-center, prospective, observational, 7-month study was based on a cohort of 99 non-infected adult patients admitted to ICUs for elective cardiac surgery, trauma, acute brain injury and post-operative prolonged ventilation, and followed up during the ICU stay. Clinical characteristics and the severity score (SOFA) were recorded on admission. Platelet activation markers, including fibrinogen binding to platelets, platelet membrane P-selectin expression, plasma soluble CD40L, and platelet-leukocyte aggregates, were assayed by flow cytometry at admission and 48 h later, and also at the time of sepsis diagnosis (Sepsis-3 criteria) and 7 days later for sepsis patients. Hospitalization data and outcomes were also recorded. Of the 99 patients, 19 developed sepsis after a median time of 5 days. Their SOFA at admission was higher, and their levels of fibrinogen binding to platelets (platelet-Fg) and of D-dimers were significantly increased compared to the other patients. Differences in levels 48 h after ICU admission were no longer significant. Platelet-Fg % was an independent predictor of sepsis (P = 0.030). By ROC curve analysis, the cutoff points for SOFA (AUC = 0.85) and platelet-Fg (AUC = 0.75) were 8 and 50%, respectively. The prior risk of sepsis (19%) increased to 50% when SOFA was above 8, to 46% when platelet-Fg was above 50%, and to 87% when both SOFA and platelet-Fg were above their cutoff values. By contrast, when the two parameters were below their cutoffs, the risk of sepsis was negligible (3.8%). Patients with sepsis had longer ICU and hospital stays and a higher death rate.
Conclusion: In addition to SOFA, platelet-bound fibrinogen levels assayed by flow cytometry within 24 h of ICU admission help identify critically ill patients at risk of developing sepsis.
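The cutoff values reported above come from ROC curve analysis. As a minimal illustration of how such a cutoff is derived — on synthetic marker values, not the study's data — the sketch below computes an ROC curve, its trapezoidal AUC, and a Youden-index cutoff in plain NumPy:

```python
import numpy as np

def roc_curve_points(scores, labels):
    """FPR and TPR at each threshold, thresholds taken in decreasing order.
    Assumes untied scores; labels are 0/1."""
    order = np.argsort(-scores)
    y = labels[order]
    tpr = np.cumsum(y) / y.sum()
    fpr = np.cumsum(1 - y) / (len(y) - y.sum())
    return fpr, tpr, scores[order]

def auc(fpr, tpr):
    """Trapezoidal area under the ROC curve, starting from (0, 0)."""
    f = np.concatenate([[0.0], fpr])
    t = np.concatenate([[0.0], tpr])
    return float(np.sum(np.diff(f) * (t[1:] + t[:-1]) / 2.0))

def youden_cutoff(fpr, tpr, thresholds):
    """Threshold maximizing Youden's J = TPR - FPR."""
    return float(thresholds[np.argmax(tpr - fpr)])

# Synthetic marker values only -- NOT the study's data: 80 non-septic
# patients with lower values, 19 septic patients with higher values.
rng = np.random.default_rng(42)
labels = np.concatenate([np.zeros(80), np.ones(19)])
scores = np.concatenate([rng.normal(5.0, 2.0, 80), rng.normal(9.0, 2.0, 19)])
fpr, tpr, thresholds = roc_curve_points(scores, labels)
print("AUC:", round(auc(fpr, tpr), 2), "| cutoff:", round(youden_cutoff(fpr, tpr, thresholds), 1))
```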

Peer Reviewed
Simple connectome inference from partial correlation statistics in calcium imaging
Sutera, Antonio ULiege; Joly, Arnaud ULiege; François-Lavet, Vincent et al

in Soriano, Jordi; Battaglia, Demian; Guyon, Isabelle (Eds.) et al Neural Connectomics Challenge (2017)

In this work, we propose a simple yet effective solution to the problem of connectome inference in calcium imaging data. The proposed algorithm consists of two steps: first, the raw signals are processed to detect neural peak activities; second, the degree of association between neurons is inferred from partial correlation statistics. This paper summarises the methodology that led us to win the Connectomics Challenge, proposes a simplified version of our method, and compares our results with those of other inference methods.
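The second step can be sketched in a few lines of NumPy. This is only a minimal illustration of the partial correlation statistic on hypothetical signals, not the authors' full pipeline (which first detects peaks in the raw calcium traces):

```python
import numpy as np

def partial_correlation(signals):
    """Partial correlation between every pair of neurons, computed from
    the inverse of the empirical covariance (precision) matrix.
    signals has shape (n_neurons, n_samples)."""
    precision = np.linalg.inv(np.cov(signals))
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr

# Hypothetical chain network a -> b -> c: a and c are marginally correlated,
# but conditionally independent given b, so their partial correlation vanishes.
rng = np.random.default_rng(0)
a = rng.normal(size=5000)
b = a + 0.5 * rng.normal(size=5000)
c = b + 0.5 * rng.normal(size=5000)
pcorr = partial_correlation(np.vstack([a, b, c]))
```

Unlike plain correlation, the partial correlation suppresses the indirect a-c association while keeping the direct a-b and b-c links, which is what makes it suitable for distinguishing true connections from transitive ones.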

Exploiting random projections and sparsity with random forests and gradient boosting methods - Application to multi-label and multi-output learning, random forest model compression and leveraging input sparsity
Joly, Arnaud ULiege

Doctoral thesis (2017)

Within machine learning, the supervised learning field aims at modeling the input-output relationship of a system from past observations of its behavior. Decision trees characterize the input-output relationship through a series of nested "if-then-else" questions (the testing nodes) leading to a set of predictions (the leaf nodes). Several such trees are often combined for state-of-the-art performance: random forest ensembles average the predictions of randomized decision trees trained independently in parallel, while tree boosting ensembles train decision trees sequentially to refine the predictions made by the previous ones. The emergence of new applications requires supervised learning algorithms that scale, in computational power and memory space, with the number of inputs, outputs, and observations without sacrificing accuracy. In this thesis, we identify three main areas where decision tree methods could be improved, for which we provide and evaluate original algorithmic solutions: (i) learning over high-dimensional output spaces, (ii) learning with large sample datasets under stringent memory constraints at prediction time, and (iii) learning over high-dimensional sparse input spaces.

A first approach to learning tasks with a high-dimensional output space, called binary relevance or single target, is to train one decision tree ensemble per output. However, it completely neglects the potential correlations between the outputs. An alternative approach, multi-output decision trees, fits a single decision tree ensemble targeting all outputs simultaneously, assuming that all outputs are correlated. Nevertheless, both approaches (i) have exactly the same computational complexity and (ii) target extreme output correlation structures.

In our first contribution, we show how to combine random projections of the output space, a dimensionality reduction method, with the random forest algorithm, decreasing the learning time complexity. The accuracy is preserved, and may even be improved by reaching a different bias-variance tradeoff. In our second contribution, we first formally adapt the gradient boosting ensemble method to multi-output supervised learning tasks such as multi-output regression and multi-label classification. We then propose to combine single random projections of the output space with gradient boosting on such tasks to adapt automatically to the output correlation structure.

The random forest algorithm often generates large ensembles of complex models thanks to the availability of a large number of observations. However, the space complexity of such models, proportional to their total number of nodes, is often prohibitive, and therefore these models are not well suited to stringent memory constraints at prediction time. In our third contribution, we propose to compress these ensembles by solving an L1-based regularization problem over the set of indicator functions defined by all their nodes.

Some supervised learning tasks have a high-dimensional but sparse input space, where each observation has non-zero values for only a few of the input variables. Standard decision tree implementations are not well adapted to sparse input spaces, unlike other supervised learning techniques such as support vector machines or linear models. In our fourth contribution, we show how to exploit input space sparsity algorithmically within decision tree methods. Our implementation yields a significant speed-up on both synthetic and real datasets, while leading to exactly the same model. It also reduces the memory required to grow such models by exploiting sparse instead of dense memory storage for the input matrix.
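The fourth contribution exploits input sparsity. As a rough, hypothetical illustration (not the thesis implementation), compressed sparse column (CSC) storage gives a column-wise tree learner contiguous access to each feature's non-zero values while using far less memory than a dense matrix:

```python
import numpy as np
from scipy import sparse

# A hypothetical input matrix where roughly 1% of the entries are non-zero,
# as in text or bioinformatics datasets.
rng = np.random.default_rng(0)
dense = rng.random((1000, 500))
dense[dense < 0.99] = 0.0
X = sparse.csc_matrix(dense)

# Tree induction evaluates candidate splits one feature (column) at a time;
# CSC storage keeps each column's non-zero values contiguous in memory.
col_values = X.data[X.indptr[3]:X.indptr[4]]   # non-zeros of feature 3

dense_bytes = dense.nbytes
sparse_bytes = X.data.nbytes + X.indices.nbytes + X.indptr.nbytes
print(f"dense: {dense_bytes} B, sparse: {sparse_bytes} B")
```

With ~1% density, the three CSC arrays occupy a small fraction of the dense footprint, which is the memory saving the abstract refers to.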

Peer Reviewed
Joint learning and pruning of decision forests
Begon, Jean-Michel ULiege; Joly, Arnaud ULiege; Geurts, Pierre ULiege

Conference (2016, September 12)

Decision forests such as Random Forests and Extremely Randomized Trees are state-of-the-art supervised learning methods. Unfortunately, they tend to consume a large amount of memory. In this work, we propose an alternative algorithm to derive decision forests under heavy memory constraints. We show that under such constraints our method usually outperforms simpler baselines and can even sometimes beat the original forest.

Peer Reviewed
Erratum: Elevated basal levels of circulating activated platelets predict ICU-acquired sepsis and mortality: a prospective study.
Layios, N.; Delierneux, Céline ULiege; Hego, A. et al

in Critical care (London, England) (2015), 19(1), 301

Peer Reviewed
Erratum: Prospective immune profiling in critically ill adults: before, during and after severe sepsis and septic shock.
Layios, N.; GOSSET, Christian ULiege; Delierneux, Céline ULiege et al

in Critical care (London, England) (2015), 19(1), 300

Peer Reviewed
Random forests with random projections of the output space for high dimensional multi-label classification
Joly, Arnaud ULiege; Geurts, Pierre ULiege; Wehenkel, Louis ULiege

in Machine Learning and Knowledge Discovery in Databases (2014, September 15)

We adapt the idea of random projections applied to the output space, so as to enhance tree-based ensemble methods in the context of multi-label classification. We show how learning time complexity can be reduced without affecting computational complexity and accuracy of predictions. We also show that random output space projections may be used in order to reach different bias-variance tradeoffs, over a broad panel of benchmark problems, and that this may lead to improved accuracy while significantly reducing the computational burden of the learning stage.
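The core mechanism can be sketched in NumPy. This is a hypothetical toy setup, not the paper's experiments: a high-dimensional label matrix is replaced by a few Gaussian projections before fitting the ensemble, and the method preserves accuracy because variance-based split scores depend only on distances between label vectors, which random projections approximately preserve (Johnson-Lindenstrauss):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 100, 1000, 64                 # n samples, d labels, m << d projections
Y = (rng.random((n, d)) < 0.05).astype(float)   # sparse multi-label target matrix
Phi = rng.normal(size=(d, m)) / np.sqrt(m)      # Gaussian random projection matrix
Yp = Y @ Phi                            # the forest is fit on Yp instead of Y

# Pairwise squared distances between label vectors are approximately
# preserved, so split scores computed on Yp track those computed on Y.
ratios = [
    np.sum((Yp[i] - Yp[j]) ** 2) / np.sum((Y[i] - Y[j]) ** 2)
    for i in range(20) for j in range(i + 1, 20)
]
print("mean distance ratio:", round(float(np.mean(ratios)), 2))
```

Fitting on the m-column `Yp` instead of the d-column `Y` is where the learning-time reduction comes from.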

Peer Reviewed
Simple connectome inference from partial correlation statistics in calcium imaging
Sutera, Antonio ULiege; Joly, Arnaud ULiege; François-Lavet, Vincent ULiege et al

in Soriano, Jordi; Battaglia, Demian; Guyon, Isabelle (Eds.) et al Neural Connectomics Challenge (2014)

In this work, we propose a simple yet effective solution to the problem of connectome inference in calcium imaging data. The proposed algorithm consists of two steps: first, the raw signals are processed to detect neural peak activities; second, the degree of association between neurons is inferred from partial correlation statistics. This paper summarises the methodology that led us to win the Connectomics Challenge, proposes a simplified version of our method, and compares our results with those of other inference methods.

Scikit-Learn: Machine Learning in the Python ecosystem
Joly, Arnaud ULiege; Louppe, Gilles ULiege

Poster (2014, January 27)

The scikit-learn project is an increasingly popular machine learning library written in Python. It is designed to be simple and efficient, useful to both experts and non-experts, and reusable in a variety of contexts. The primary aim of the project is to provide a compendium of efficient implementations of classic, well-established machine learning algorithms. Among other things, it includes classical supervised and unsupervised learning algorithms, tools for model evaluation and selection, as well as tools for data preprocessing and feature engineering. This presentation will illustrate the use of scikit-learn as a component of the larger scientific Python environment to solve complex data analysis tasks. Examples will include end-to-end workflows based on powerful and popular algorithms in the library. Among others, we will show how to use out-of-core learning with on-the-fly feature extraction to tackle very large natural language processing tasks, how to exploit an IPython cluster for distributed cross-validation, or how to build and use random forests to explore biological data.
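The out-of-core pattern mentioned above can be sketched with scikit-learn's own API. The mini-batches below are hypothetical stand-ins for a text stream too large to fit in memory; `HashingVectorizer` is stateless (no vocabulary to fit), so features can be extracted on the fly, and `SGDClassifier` learns incrementally via `partial_fit`:

```python
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.linear_model import SGDClassifier

# Hypothetical mini-batches standing in for a stream of labeled documents.
batches = [
    (["good movie", "great plot"], [1, 1]),
    (["terrible movie", "awful plot"], [0, 0]),
    (["great movie", "awful acting"], [1, 0]),
]

vectorizer = HashingVectorizer(n_features=2 ** 18)  # stateless feature extraction
clf = SGDClassifier(random_state=0)
for texts, y in batches:
    X = vectorizer.transform(texts)                 # on-the-fly vectorization
    clf.partial_fit(X, y, classes=[0, 1])           # incremental (out-of-core) update

pred = clf.predict(vectorizer.transform(["great movie"]))
```

Because neither the vectorizer nor the classifier ever needs the full dataset at once, this loop scales to arbitrarily long streams.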

Peer Reviewed
API design for machine learning software: experiences from the scikit-learn project
Buitinck, Lars; Louppe, Gilles ULiege; Blondel, Mathieu et al

Conference (2013, September 23)

scikit-learn is an increasingly popular machine learning library. Written in Python, it is designed to be simple and efficient, accessible to non-experts, and reusable in various contexts. In this paper, we present and discuss our design choices for the application programming interface (API) of the project. In particular, we describe the simple and elegant interface shared by all learning and processing units in the library and then discuss its advantages in terms of composition and reusability. The paper also comments on implementation details specific to the Python ecosystem and analyzes obstacles faced by users and developers of the library.
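The shared interface the paper describes boils down to a few conventions: `fit` learns from data and returns `self`, learned attributes carry a trailing underscore, and mixins provide composite behavior. A toy transformer (hypothetical, for illustration only) following these conventions:

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class MeanCenterer(BaseEstimator, TransformerMixin):
    """Toy transformer following the scikit-learn API conventions."""

    def fit(self, X, y=None):
        # Attributes estimated from data get a trailing underscore.
        self.mean_ = np.asarray(X).mean(axis=0)
        return self                      # fit returns self, so calls can be chained

    def transform(self, X):
        return np.asarray(X) - self.mean_

# fit_transform is inherited from TransformerMixin; get_params/set_params
# come from BaseEstimator, which is what makes pipelines and grid search work.
Xt = MeanCenterer().fit_transform([[1.0, 2.0], [3.0, 4.0]])
```

Because every estimator exposes the same small surface, any such object can be dropped into a `Pipeline` or cross-validation loop unchanged — the composition and reusability advantage the abstract refers to.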

L1-based compression of random forest models
Joly, Arnaud ULiege; Schnitzler, François ULiege; Geurts, Pierre ULiege et al

in Proceeding of the 21st Belgian-Dutch Conference on Machine Learning (2012, May 24)

Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive, especially in the context of problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that preserving or even improving the model accuracy while significantly reducing its space complexity is indeed possible.
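The idea can be sketched with scikit-learn. This is a simplified illustration on synthetic data, not the paper's experimental setup: the node indicator functions come from `decision_path`, and an L1-regularized linear model then zeroes out most of them, marking the corresponding nodes as prunable:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

X, y = make_regression(n_samples=200, n_features=10, random_state=0)
forest = RandomForestRegressor(n_estimators=20, random_state=0).fit(X, y)

# One binary indicator feature per node of the ensemble: Z[i, k] = 1 iff
# sample i traverses node k when propagated through the trees.
Z, _ = forest.decision_path(X)

# L1 regularization over the node indicators: nodes whose weight is driven
# to zero can be pruned from the ensemble.
lasso = Lasso(alpha=0.1, max_iter=5000).fit(Z.toarray(), y)
kept = int(np.count_nonzero(lasso.coef_))
total = Z.shape[1]
print(f"kept {kept} of {total} node indicators")
```

The ratio `kept / total` is the compression factor: only the subtrees whose nodes retain non-zero weight need to be stored at prediction time.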

Peer Reviewed
L1-based compression of random forest models
Joly, Arnaud ULiege; Schnitzler, François ULiege; Geurts, Pierre ULiege et al

in 20th European Symposium on Artificial Neural Networks (2012, April)

Random forests are effective supervised learning methods applicable to large-scale datasets. However, the space complexity of tree ensembles, in terms of their total number of nodes, is often prohibitive, especially in the context of problems with very high-dimensional input spaces. We propose to study their compressibility by applying an L1-based regularization to the set of indicator functions defined by all their nodes. We show experimentally that preserving or even improving the model accuracy while significantly reducing its space complexity is indeed possible.

Pruning randomized trees with L1-norm regularization
Joly, Arnaud ULiege; Schnitzler, François ULiege; Geurts, Pierre ULiege et al

Poster (2011, November 29)

The growing amount of high-dimensional data requires robust analysis techniques. Tree-based ensemble methods provide accurate supervised learning models, but the model complexity can become extremely large depending on the dimension of the dataset. Here we propose a method to compress such an ensemble using the space induced by the randomized trees and L1-norm regularization. This leads to a drastic pruning that preserves or improves the model accuracy. Moreover, our approach increases robustness with respect to the selection of complexity parameters.

Improvement of randomized ensembles of trees for supervised learning in very high dimension
Joly, Arnaud ULiege

Master's dissertation (2011)

Tree-based ensemble methods, such as random forests and extremely randomized trees, are methods of choice for handling high-dimensional problems. One important drawback of these methods, however, is the complexity of the models (i.e., the large number and size of trees) they produce to achieve good performance. In this work, several research directions are identified to address this problem; among those, we have developed the following one. From a tree ensemble, one can extract a set of binary features, each associated with a leaf or a node of a tree and true for a given object only if the object reaches the corresponding leaf or node when propagated through that tree. Given this representation, the prediction of the ensemble can be retrieved by linearly combining these characteristic features with appropriate weights. We apply a linear feature selection method, namely the monotone LASSO, to these features in order to simplify the tree ensemble: a subtree is then pruned as soon as the characteristic features corresponding to its constituting nodes are not selected in the linear model. Empirical experiments show that the combination of the monotone LASSO with features extracted from tree ensembles leads to a drastic reduction of the number of features and can at the same time improve the accuracy with respect to unpruned ensembles of trees.
