Dimensionality reduction is a common step in machine learning, especially when the feature space is high-dimensional. The original feature space is mapped onto a new space of reduced dimensionality, and the examples used by machine learning algorithms are represented in that new space. The mapping is usually performed by selecting a subset of the original features, by constructing new features, or both. This presentation deals with the first approach, feature subset selection. We give a brief overview of the feature subset selection techniques commonly used in machine learning and a more detailed description of feature subset selection for text data. The performance of some methods used in document categorization is illustrated by an experimental comparison on real-world data collected from the Web.
Author: Dunja Mladenić, Jožef Stefan Institute
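
As a rough illustration of feature subset selection on text, the sketch below scores each term and keeps only the top-scoring ones, then represents documents in the reduced space. The scoring measure (document frequency), the function names `select_features` and `project`, and the toy documents are all assumptions made for this example; the abstract does not say which measures the presentation compares.

```python
from collections import Counter

def select_features(docs, k):
    """Keep the k terms that occur in the most documents.

    Document frequency is an assumed, illustrative scoring measure.
    Ties are broken alphabetically so the result is deterministic.
    """
    df = Counter()
    for doc in docs:
        # Count each term once per document.
        df.update(set(doc.lower().split()))
    ranked = sorted(df, key=lambda term: (-df[term], term))
    return ranked[:k]

def project(doc, vocab):
    """Represent a document as term counts over the selected features only."""
    counts = Counter(doc.lower().split())
    return [counts[term] for term in vocab]

# Toy corpus (hypothetical data, not from the presentation).
docs = [
    "feature selection for text data",
    "text categorization on web data",
    "feature selection reduces dimensionality",
]

vocab = select_features(docs, 3)
vectors = [project(d, vocab) for d in docs]
print(vocab)
print(vectors)
```

This is a filter-style selection: features are ranked independently of any particular learning algorithm, and every original feature outside the selected subset is simply dropped.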
