Reference : Segment and combine approach for Biological Sequence Classification
Scientific congresses and symposiums : Paper published in a book
Engineering, computing & technology : Computer science
http://hdl.handle.net/2268/25763
English
Geurts, Pierre [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Systèmes et modélisation]
Blanco Cuesta, Antia [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Systèmes et modélisation]
Wehenkel, Louis [Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst. Montefiore) > Systèmes et modélisation]
2005
Proc. IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2005)
194-201
Yes
No
International
IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB 2005)
14-15 Nov. 2005
San Diego
USA
[en] bioinformatics ; machine learning
[en] This paper presents a new algorithm, based on the segment and combine paradigm, for the automatic classification of biological sequences. It classifies a sequence by aggregating the predictions made for its subsequences by a classifier that is learned from a random sample of training subsequences. This generic approach is combined with decision tree based ensemble methods, which scale well with respect to both sample size and vocabulary size. The method is applied to three families of problems: DNA sequence recognition, splice junction detection, and gene regulon prediction. Compared with standard approaches based on n-grams, it appears competitive in terms of accuracy, flexibility, and scalability. The paper also highlights the possibility of exploiting the resulting models to identify interpretable patterns specific to a given class of biological sequences.
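The segment and combine idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the positional naive Bayes model below is a simple stand-in for the decision tree ensembles used in the paper, and every name (sample_segments, PositionalNaiveBayes, classify_sequence, LENGTH), the window length, and the toy sequences are assumptions made only for illustration.

```python
import math
import random
from collections import defaultdict


def sample_segments(sequences, labels, length, n_samples, rng):
    """Draw random fixed-length subsequences; each inherits its parent's label."""
    segments = []
    for _ in range(n_samples):
        i = rng.randrange(len(sequences))
        seq = sequences[i]
        start = rng.randrange(len(seq) - length + 1)
        segments.append((seq[start:start + length], labels[i]))
    return segments


class PositionalNaiveBayes:
    """Stand-in segment classifier (assumption: NOT the paper's tree ensembles)."""

    def __init__(self, alphabet="ACGT"):
        self.alphabet = alphabet

    def fit(self, segments):
        self.class_counts = defaultdict(int)
        self.counts = defaultdict(int)  # (label, position, symbol) -> count
        for seg, label in segments:
            self.class_counts[label] += 1
            for pos, sym in enumerate(seg):
                self.counts[(label, pos, sym)] += 1
        self.classes = sorted(self.class_counts)
        return self

    def predict_proba(self, seg):
        """Return a dict mapping each class to its probability for one segment."""
        total = sum(self.class_counts.values())
        log_scores = {}
        for c in self.classes:
            n_c = self.class_counts[c]
            score = math.log(n_c / total)
            for pos, sym in enumerate(seg):
                count = self.counts.get((c, pos, sym), 0)
                # Laplace smoothing avoids log(0) for unseen symbols.
                score += math.log((count + 1) / (n_c + len(self.alphabet)))
            log_scores[c] = score
        top = max(log_scores.values())
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        z = sum(exp.values())
        return {c: v / z for c, v in exp.items()}


def classify_sequence(model, seq, length):
    """Combine step: average segment-level probabilities over all windows."""
    n_windows = len(seq) - length + 1
    votes = defaultdict(float)
    for start in range(n_windows):
        probs = model.predict_proba(seq[start:start + length])
        for c, p in probs.items():
            votes[c] += p / n_windows
    return max(votes, key=votes.get)


# Toy data (hypothetical): AT-only sequences versus GC-only sequences.
LENGTH = 5
train_seqs = ["ATATTAATATTATAAT", "TTAATATATAATTATA",
              "GCGCCGGCGCCGCGGC", "CCGGCGCGCGGCCGCG"]
train_labels = ["at", "at", "gc", "gc"]
rng = random.Random(0)
segments = sample_segments(train_seqs, train_labels, LENGTH, 200, rng)
model = PositionalNaiveBayes().fit(segments)
```

Calling classify_sequence(model, some_sequence, LENGTH) then returns the class whose averaged segment-level probability is highest. Averaging probabilities is only one possible way to combine the segment predictions; the sketch makes no claim about which aggregation rule the paper uses.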
http://www.montefiore.ulg.ac.be/services/stochastic/pubs/2005/GBW05

File(s) associated to this reference

Fulltext file(s):

File: geurts-cibcb2005.pdf
Version: Publisher postprint
Size: 102.46 kB
Access: Open access
