Publication

UNSUPERVISED LEARNING APPROACH TO FEATURE SELECTION IN BIOLOGICAL DATA ANALYSIS

Outline:

W. Jacak, K. Pröll - UNSUPERVISED LEARNING APPROACH TO FEATURE SELECTION IN BIOLOGICAL DATA ANALYSIS - Proceedings of the 24th European Modeling and Simulation Symposium EMSS 2012, Vienna, Austria, 2012, pp. 14-20

Abstract:

In this paper we present a novel method for scoring function specification and feature selection that combines unsupervised learning with supervised cross validation. Unsupervised clustering methods (k-means, one-dimensional Kohonen SOM, fuzzy c-means) are used to cluster the object data for a chosen subset of input features and a given number of clusters. The resulting object clusters are compared with the predefined original object classes, and a matching factor (score) is calculated. This score is used as the criterion function for heuristic sequential feature selection and for a novel cross-selection algorithm.
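
The general idea of scoring a feature subset by how well its clusters match the known classes can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define the matching factor, so cluster purity (majority-class agreement) is assumed as a stand-in. Only k-means is shown, and the cross validation and cross selection steps are omitted. All function names and the toy data are hypothetical and chosen for illustration only.

```python
import random
from collections import Counter

def kmeans(points, k, iters=50, seed=0):
    """Plain k-means on a list of equal-length tuples; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Recompute each center as the mean of its members; keep the old center if empty.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels

def matching_score(cluster_labels, class_labels):
    """Assumed matching factor (purity): share of objects in the majority class of their cluster."""
    by_cluster = {}
    for c, y in zip(cluster_labels, class_labels):
        by_cluster.setdefault(c, Counter())[y] += 1
    return sum(max(cnt.values()) for cnt in by_cluster.values()) / len(class_labels)

def subset_score(X, y, features, k):
    """Cluster the objects using only the chosen features, then score against the classes."""
    points = [tuple(row[f] for f in features) for row in X]
    return matching_score(kmeans(points, k), y)

def sequential_forward_selection(X, y, k, max_features):
    """Greedy selection: add the feature that most improves the score; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        score, f = max(
            ((subset_score(X, y, selected + [f], k), f) for f in remaining),
            key=lambda t: t[0],
        )
        if score <= best:
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best

# Toy data (illustrative only): feature 0 separates the two classes, feature 1 is noise.
X = [(0.0, 5.0), (0.2, 1.0), (0.1, 9.0), (0.3, 3.0),
     (10.0, 4.0), (10.2, 8.0), (10.1, 2.0), (9.9, 6.0)]
y = [0, 0, 0, 0, 1, 1, 1, 1]
selected, best = sequential_forward_selection(X, y, k=2, max_features=2)
print(selected, best)
```

Purity is used here because it does not depend on how the clusters happen to be numbered. A different matching factor, or one of the other clustering methods named in the abstract, could be substituted inside `subset_score` without changing the selection loop.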