Scientific journals : Article
Engineering, computing & technology : Computer science
http://hdl.handle.net/2268/159521
On protocols and measures for the validation of supervised methods for the inference of biological networks
English
Schrynemackers, Marie [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
Kuffner, Robert [Ludwig-Maximilians-University, Munich, Germany > Institute for Practical Informatics and Bioinformatics]
Geurts, Pierre [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation]
3-Dec-2013
Frontiers in Genetics
4
262
Yes
International
[en] biological network inference ; supervised learning ; cross-validation ; evaluation protocols ; ROC curves ; precision-recall curves
[en] Networks provide a natural representation of molecular biology knowledge, in particular to model relationships between biological entities such as genes, proteins, drugs, or diseases. Because of the effort, the cost, or the lack of the experiments necessary for the elucidation of these networks, computational approaches for network inference have been frequently investigated in the literature. In this paper, we examine the assessment of supervised network inference. Supervised inference is based on machine learning techniques that infer the network from a training sample of known interacting and possibly non-interacting entities and additional measurement data. While these methods are very effective, their reliable validation in silico poses a challenge, since both prediction and validation need to be performed on the basis of the same partially known network. Cross-validation techniques need to be specifically adapted to classification problems on pairs of objects. We perform a critical review and assessment of protocols and measures proposed in the literature and derive specific guidelines on how best to exploit and evaluate machine learning techniques for network inference. Through theoretical considerations and in silico experiments, we analyze in depth how important factors influence the outcome of performance estimation. These factors include the amount of information available for the interacting entities, the sparsity and topology of biological networks, and the lack of experimentally verified non-interacting pairs.
10.3389/fgene.2013.00262
http://www.frontiersin.org/Journal/10.3389/fgene.2013.00262/abstract

File(s) associated to this reference

Fulltext file(s):

File: schrynemackers13-frontiers.pdf
Version: Publisher postprint
Size: 2.76 MB
Access: Open access

Additional material(s):

File: schrynemackers13-supplementary.pdf
Size: 114.39 kB
Access: Open access

All documents in ORBi are protected by a user license.