Reference : Supervised learning with decision tree-based methods in computational and systems biology
Scientific journals : Article
Engineering, computing & technology : Computer science
http://hdl.handle.net/2268/25745
English
Geurts, Pierre [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Irrthum, Alexandre [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Wehenkel, Louis [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Publication date: Dec-2009
Journal: Molecular Biosystems
Volume: 5
Issue: 12
Pages: 1593-1605
Peer reviewed: Yes (verified by ORBi)
Audience: International
ISSN: 1742-206X
e-ISSN: 1742-2051
[en] Machine Learning ; Bioinformatics
[en] At the intersection between artificial intelligence and statistics, supervised learning provides
algorithms that automatically build predictive models solely from observations of a system. During the
last twenty years, supervised learning has been a tool of choice for analyzing the ever-growing and
increasingly complex data generated in the context of molecular biology, with successful applications
in genome annotation, function prediction, or biomarker discovery. Among supervised learning
methods, decision tree-based methods stand out as non-parametric methods that have the unique
feature of combining interpretability, efficiency, and, when used in ensembles of trees, excellent
accuracy. The goal of this paper is to provide an accessible and comprehensive introduction to this
class of methods. The first part of the paper is devoted to an intuitive but complete description of
decision tree-based methods and a discussion of their strengths and limitations with respect to other
supervised learning methods. The second part of the paper provides a survey of their applications
in the context of computational and systems biology.
The supplementary material provides information about various non-standard extensions of the
decision tree-based approach to modeling, some practical guidelines for the choice of parameters
and algorithm variants depending on the practical objectives of their application, pointers to freely
accessible software packages, and a brief primer going through the different manipulations needed
to use the tree-induction packages available in the R statistical tool.

File(s) associated to this reference

Fulltext file(s):

File: geurts09-molecularbiosystems.pdf
Version: Author preprint
Size: 317.18 kB
Access: Restricted access (request copy)


All documents in ORBi are protected by a user license.