Embedding Monte Carlo search of features in tree-based ensemble methods
English
Maes, Francis[Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation >]
Geurts, Pierre[Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation >]
Wehenkel, Louis[Université de Liège - ULg > Dép. d'électric., électron. et informat. (Inst.Montefiore) > Systèmes et modélisation >]
Sep-2012
Machine Learning and Knowledge Discovery in Databases
Flach, Peter
De Bie, Tijl
Cristianini, Nello
Springer
Lecture Notes in Artificial Intelligence
191-206
International
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases
from 24-09-2012 to 28-09-2012
Bristol
United Kingdom
[en] Embedded Feature Generation ; Monte Carlo Search ; Decision Trees ; Random Forests ; Tree Boosting
[en] Feature generation is the problem of automatically constructing good features for a given target learning problem. While most feature generation algorithms belong either to the filter or to the wrapper approach, this paper focuses on embedded feature generation.
We propose a general scheme to embed feature generation in a wide range of tree-based learning algorithms, including single decision trees, random forests and tree boosting. It is based on the formalization of feature construction as a sequential decision-making problem addressed by a tractable Monte Carlo search algorithm coupled with node splitting. This leads to fast algorithms that are applicable to large-scale problems. We empirically analyze the performance of these tree-based learners, with and without the feature generation capability, on several standard datasets.
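As a rough illustration of the idea in the abstract, the following sketch couples a plain Monte Carlo search over candidate features with the choice of a split at one regression-tree node. It is not the authors' algorithm. The operator set, the formula depth, the leaf probability, the number of rollouts, the variance-reduction score and all function names are assumptions made for this example only.

```python
import random

# Hypothetical operator set; the abstract does not say which operators are used.
OPERATORS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


def sample_feature(n_inputs, rng, max_depth=2):
    """Build one random formula through a sequence of random choices (one rollout)."""
    if max_depth == 0 or rng.random() < 0.3:  # assumed probability of stopping at a variable
        return ("var", rng.randrange(n_inputs))
    op = rng.choice(sorted(OPERATORS))
    left = sample_feature(n_inputs, rng, max_depth - 1)
    right = sample_feature(n_inputs, rng, max_depth - 1)
    return (op, left, right)


def evaluate(feature, x):
    """Evaluate a formula tree on one input vector x."""
    if feature[0] == "var":
        return x[feature[1]]
    return OPERATORS[feature[0]](evaluate(feature[1], x), evaluate(feature[2], x))


def sse(ys):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)


def best_threshold(values, ys):
    """Return (variance_reduction, threshold) of the best binary split on one feature."""
    pairs = sorted(zip(values, ys))
    total = sse(ys)
    best = (0.0, None)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = total - sse(left) - sse(right)
        if gain > best[0]:
            best = (gain, (pairs[i - 1][0] + pairs[i][0]) / 2)
    return best


def split_node(X, y, n_rollouts=200, seed=0):
    """Score the raw inputs plus n_rollouts sampled formulas; return (gain, feature, threshold)."""
    rng = random.Random(seed)
    n_inputs = len(X[0])
    candidates = [("var", j) for j in range(n_inputs)]
    candidates += [sample_feature(n_inputs, rng) for _ in range(n_rollouts)]
    best = (0.0, None, None)
    for feature in candidates:
        values = [evaluate(feature, x) for x in X]
        gain, threshold = best_threshold(values, y)
        if threshold is not None and gain > best[0]:
            best = (gain, feature, threshold)
    return best
```

Calling `split_node(X, y)` on a list of input tuples `X` and numeric targets `y` returns the best-scoring candidate feature and its threshold. Because the raw inputs are always among the candidates, the selected split is never worse, under this score, than the best split on an original variable. A full tree learner would apply the same step recursively at every node.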