Neural Networks based Feature Selection for Biomarker Analysis


K. Pröll, W. Jacak, M. Epstein - Neural Networks based Feature Selection for Biomarker Analysis - Sixth International Workshop on Machine Learning in Systems Biology, Basel, Switzerland, 2012, pp. 1


The main objectives of machine learning are data-driven classification, problem solving, and control. Classifying biological data means developing a model that assigns biological observations to a set of predetermined classes. A biological data set is typically composed of many variables (features) that represent measurements of biological attributes in experiments. A common property of biological data is high dimensionality combined with a relatively small sample size, known as the high dimensionality-small sample problem. The smaller the sample, the less accurate the classification results and the larger the error. Traditional statistical classification procedures such as discriminant analysis are built on Bayesian decision theory. One major limitation of these statistical models is that they work well only when their underlying assumptions are satisfied, so users must have a good knowledge of both the data properties and the model capabilities before the models can be applied successfully. In this paper we propose an unsupervised-learning neural network approach to find the best classifier for monitoring GMO consumption.