Publication

UNSUPERVISED LEARNING APPROACH TO FEATURE SELECTION IN BIOLOGICAL DATA ANALYSIS

Publication, 2012

Outline

W. Jacak, K. Pröll - UNSUPERVISED LEARNING APPROACH TO FEATURE SELECTION IN BIOLOGICAL DATA ANALYSIS - Proceedings of the 24th European Modeling and Simulation Symposium EMSS 2012, Vienna, Austria, 2012, pp. 14-20

Abstract

In this paper we present a novel method for scoring function specification and feature selection that combines unsupervised learning with supervised cross-validation. Unsupervised clustering methods (k-means, one-dimensional Kohonen SOM, fuzzy c-means) are used to cluster the object data for a chosen subset of input features and a given number of clusters. The resulting object clusters are compared with the predefined original object classes, and a matching factor (score) is calculated. This score is used as the criterion function for heuristic sequential feature selection and for a novel cross-selection algorithm.
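
The scoring idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several pieces are assumptions made only for illustration: the matching factor is taken to be cluster purity (the share of objects that fall into the majority class of their cluster), the clustering is a plain k-means, and the search is a simple greedy forward selection. The abstract does not define the matching factor, and the cross-validation step and the cross selection algorithm are not shown. All function names (`kmeans`, `matching_score`, `subset_score`, `sequential_forward_selection`) are hypothetical.

```python
import random
from collections import Counter


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster index per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers are too
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def matching_score(cluster_labels, class_labels):
    """Assumed matching factor: purity of the clusters w.r.t. the classes."""
    matched = 0
    for c in set(cluster_labels):
        classes = [y for l, y in zip(cluster_labels, class_labels) if l == c]
        matched += Counter(classes).most_common(1)[0][1]
    return matched / len(class_labels)


def subset_score(X, y, features, k):
    """Cluster the data restricted to `features` and score the result."""
    projected = [[row[f] for f in features] for row in X]
    return matching_score(kmeans(projected, k), y)


def sequential_forward_selection(X, y, k, max_features):
    """Greedily add the feature that most improves the matching score."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        cand, cand_score = None, -1.0
        for f in remaining:
            s = subset_score(X, y, selected + [f], k)
            if s > cand_score:
                cand, cand_score = f, s
        if cand_score <= best:
            break  # no remaining feature improves the score
        selected.append(cand)
        remaining.remove(cand)
        best = cand_score
    return selected, best


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [5.0, 5.1], [5.1, 1.2], [5.2, 9.1]]
y = [0, 0, 0, 1, 1, 1]
print(sequential_forward_selection(X, y, k=2, max_features=2))
```

The class labels are used only to score a finished clustering, never to form it, which is the sense in which the criterion is unsupervised. Swapping `kmeans` for a SOM or fuzzy c-means routine would leave the selection loop unchanged.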