Prediction of E. coli Promoter Sites Utilizing the Transcription Geometry with Time-Delay Neural Networks

M.G. Reese, M. Reczko, H. Bohr and S. Suhai
German Cancer Research Center
Department of Molecular Biophysics
Im Neuenheimer Feld 280
69120 Heidelberg
Germany
M.Reese@dkfz-heidelberg.de

Abstract:

A special neural network architecture is employed to predict promoter sites in procaryotic genomes with high accuracy.

The time-delay network architecture incorporates the geometry of the transcription site in its linked receptive fields, i.e., the position and content of the -10 and -35 boxes. Knowledge about the content of these boxes is built into the architecture by initializing the receptive fields with weights obtained from training networks that recognize these boxes. Examining the content of the receptive fields after training yields important information about the actual statistics of the total set of promoter regions.
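The idea of two linked receptive fields scanning a sequence can be illustrated with a minimal Python sketch. It is not the authors' network: the function names, the consensus-based weight initialization (standing in for weights taken from separately trained box networks), the spacer range of 15 to 19 bases, and the bias value are all assumptions made for illustration.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a list of 4-element 0/1 vectors."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def field_from_consensus(consensus, match=1.0, mismatch=-0.33):
    """Hypothetical initialization: reward the consensus base at each position.
    In the abstract, these weights come from pre-trained box recognizers."""
    return [[match if base == c else mismatch for base in BASES] for c in consensus]

def field_response(x, start, weights):
    """Weighted sum of one receptive field placed at position `start`."""
    window = x[start:start + len(weights)]
    return sum(w * v
               for row_w, row_x in zip(weights, window)
               for w, v in zip(row_w, row_x))

def promoter_score(seq, w35, w10, spacers=range(15, 20), bias=-6.0):
    """Slide both linked fields along the sequence, reusing the same weights
    at every offset (the time-delay idea), and return the best sigmoid output."""
    x = one_hot(seq)
    best = 0.0
    for i in range(len(x)):
        for gap in spacers:          # assumed spacing between the two boxes
            j = i + len(w35) + gap
            if j + len(w10) > len(x):
                continue
            z = field_response(x, i, w35) + field_response(x, j, w10) + bias
            best = max(best, 1.0 / (1.0 + math.exp(-z)))
    return best

# Illustrative use with the textbook consensus boxes TTGACA (-35) and TATAAT (-10).
w35 = field_from_consensus("TTGACA")
w10 = field_from_consensus("TATAAT")
promoter_like = "GGG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGG"
background = "G" * 60
high = promoter_score(promoter_like, w35, w10)
low = promoter_score(background, w35, w10)
```

In a trained network the field weights would be adjusted by learning rather than fixed, and inspecting them afterwards is what gives access to the promoter statistics mentioned above.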

Such a network-like processing system models the structural aspects of the problem through its own architecture. The advantage of this physical neural network is twofold: biological knowledge can be integrated to construct a suitable geometry, and, conversely, the inferred prediction system provides information for understanding the biological transcription process.

The final neural network predicts the 6 known promoters on the whole genome of pBR322, which is novel to the network, with a false positive rate of less than 0.2%. On a test set of 120 promoter sequences and 3000 random sequences novel to the network, it reaches a prediction score of 85% with a false positive rate of less than 1% and a correlation coefficient of 0.82. A comparison with procaryotic promoter recognition results reported in the literature shows a significantly improved prediction accuracy, especially when applied in an independent test on complete genomes.
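To make the relationship between these figures concrete, the following Python sketch computes sensitivity, false positive rate, and a correlation coefficient from a confusion matrix. The abstract does not state which correlation coefficient was used; the Matthews correlation coefficient is assumed here. The counts are hypothetical values chosen to be compatible with a 120/3000 test set, not results taken from the paper.

```python
import math

def confusion_metrics(tp, fn, fp, tn):
    """Return (sensitivity, false positive rate, Matthews correlation)."""
    sensitivity = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, false_positive_rate, mcc

# Hypothetical counts: 120 promoters (102 found), 3000 random sequences (24 flagged).
sens, fpr, mcc = confusion_metrics(tp=102, fn=18, fp=24, tn=2976)
print(f"sensitivity={sens:.2f}  false positive rate={fpr:.3f}  MCC={mcc:.2f}")
```

Because the negative class is 25 times larger than the positive class, even a false positive rate below 1% has a visible effect on the correlation coefficient, which is why it is reported alongside the prediction score.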
