Abstract
The computational prediction of regulatory components in genomic DNA is an attractive and complex research field. The main interest lies in finding protein-coding genes in long stretches of unmapped DNA. A particularly important part of gene finding is locating promoters, a specific group of regulatory components that lie just at the beginning of a gene and initiate DNA transcription. Computational methods for promoter recognition are not yet sufficiently developed, and current methods tend to produce a large number of false predictions. We present a new method based on clustering PCA-transformed DNA data, followed by further signal processing of the clustered data. The basic system consists of eleven neural networks (one SOM ANN and ten GRNNs). On an independent test set, the system shows increased recognition accuracy with a reduced level of false-positive reporting. A special method is used to separate the data into training and test sets. The results achieved with the extended system currently appear to be the best among methods that use neural networks for promoter recognition.