D. W. Aha and R. L. Bankert. A comparative evaluation of sequential feature selection algorithms. Artificial Intelligence and Statistics, V, 1996.
D. W. Aha and R. L. Bankert. Feature selection for case-based classification of cloud types: An empirical comparison. In Proceedings of the Conference on Artificial Intelligence (AAAI-94). AAAI Press, 1994.
H. Almuallim and T. G. Dietterich. Learning with many irrelevant features. In Proceedings of the Ninth National Conference on Artificial Intelligence (AAAI-91), volume 2. AAAI Press, 1991.
Ron Bekkerman, Ran El-Yaniv, Naftali Tishby, and Yoad Winter. Distributional word clusters vs. words for text categorization. J. Mach. Learn. Res., 3:1183-1208, 2003.
A. Blum and P. Langley. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97:245-271, 1997.
R. Caruana and D. Freitag. Greedy attribute selection. In International Conference on Machine Learning, pages 28-36, 1994.
T. M. Cover and J. A. Thomas. Elements of Information Theory. John Wiley, 1990.
Inderjit S. Dhillon, Subramanyam Mallela, and Rahul Kumar. A divisive information theoretic feature clustering algorithm for text classification. J. Mach. Learn. Res., 3:1265-1287, 2003.
I. Guyon and A. Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research, 3:1157-1182, 2003.
R. Kohavi and G. H. John. Wrappers for feature subset selection. Artificial Intelligence, 97(1-2):273-324, 1997.
D. Koller and M. Sahami. Toward optimal feature selection. In International Conference on Machine Learning, pages 284-292, 1996.
G. Provan and M. Singh. Learning Bayesian networks using feature selection, 1995.
Lei Yu and Huan Liu. Efficient feature selection via analysis of relevance and redundancy. Journal of Machine Learning Research, 5:1205-1224, 2004.