Distance Matrix Predictions and Distance Statistics

M.G. Reese, O. Lund and J. Bohr
The Technical University of Denmark
Department of Physics
DK-2800 Lyngby, Denmark.

H. Bohr and S. Brunak
Center for Biological Sequence Analysis
The Technical University of Denmark
DK-2800 Lyngby, Denmark.

E-mail: reese@cbs.dtu.dk

Abstract:

We present a statistical analysis of protein structures based on interatomic Cα distances. The overall distance distributions reflect in detail the content of sequence-specific substructures maintained by local interactions (such as α-helices) and by longer-range interactions (such as disulfide bridges and β-sheets). The distance distributions are shown to be indicative of a given fold class. We also show that a volume scaling of the distances makes the distance distributions for protein chains of different lengths superimposable. Distance distributions were also calculated specifically for amino acids separated by a given number of residues. Specific features in these distributions are visible for sequence separations of up to 20 amino acid residues. A simple representation, which preserves most of the information in the distance distributions, was obtained using only 6 parameters. The parameters give rise to canonical distance intervals; when coarse-grained distance constraints are predicted by methods such as data-driven artificial neural networks, the constraints should preferably be selected from these intervals. We discuss the use of the 6 parameters for determining or reconstructing 3-dimensional protein structures.
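As an illustration of the quantities named in the abstract, the following minimal Python sketch computes a C-alpha distance matrix, collects the distances at a fixed sequence separation, and bins them into a distribution. It is not the authors' code. All function names are hypothetical, the input is assumed to be a list of (x, y, z) C-alpha coordinates in Angstrom, and the scaling by N**(1/3) is only one plausible reading of "volume scaling" (volume taken as proportional to the number of residues N); the abstract does not give the exact form.

```python
import math
from collections import Counter


def ca_distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def distances_at_separation(coords, k):
    """Return the distances between residues i and i + k along the chain."""
    return [math.dist(coords[i], coords[i + k]) for i in range(len(coords) - k)]


def volume_scaled(distances, n_residues):
    """Divide each distance by N**(1/3).

    Assumption: chain volume grows linearly with N, so lengths grow as
    N**(1/3). The abstract does not state the scaling actually used.
    """
    scale = n_residues ** (1.0 / 3.0)
    return [d / scale for d in distances]


def histogram(distances, bin_width=1.0):
    """Count distances per bin; each key is the lower edge of its bin."""
    counts = Counter(math.floor(d / bin_width) * bin_width for d in distances)
    return dict(sorted(counts.items()))


# Toy input: five C-alpha atoms on a straight line, 3.8 Angstrom apart.
# This is made-up data for illustration, not a real protein structure.
coords = [(3.8 * i, 0.0, 0.0) for i in range(5)]
matrix = ca_distance_matrix(coords)
separation_2 = distances_at_separation(coords, 2)
scaled = volume_scaled(separation_2, len(coords))
distribution = histogram(separation_2, bin_width=1.0)
```

For real structures the coordinates would come from the C-alpha atoms of a coordinate file, and the histogram for each sequence separation up to 20 would be accumulated over many chains before being compared across fold classes.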
