Functional Sites Structural Search Engine
Recognizes regions on the surface of one protein that resemble a specific binding site of another.
Abstract:
Recognizing the regions through which protein molecules function and
interact is crucial for predicting the molecular interactions that
govern most cellular processes. We present a novel method,
SiteEngine, that recognizes regions on the surface of
one protein that resemble a specific binding site of another. This may
suggest the similarity of their binding partners and biological
functions. Unlike methods that compare the locations of the backbone
atoms or the identity of the amino acids, the presented method takes
into account the physico-chemical properties of both the backbone and
the side-chains. Therefore it can recognize similar binding patterns
shared by proteins that have no sequence or fold
similarity. SiteEngine is highly efficient and suitable for
large-scale database searches of the entire PDB.
The input to SiteEngine consists of a binding site of one protein and the complete structure of another protein. The molecular surface shell of the complete molecule is searched for a region that is similar to the input binding site. The method is based on efficient hashing and matching of triangles of centers of physico-chemical properties, and it can search large protein structures in a matter of seconds. It investigates every candidate transformation that superimposes at least three centers of physico-chemical properties. We introduce a low-resolution representation by chemically important surface points and efficiently score these solutions to filter out the biologically irrelevant ones. Then, as the number of potential solutions is reduced to a smaller subset, the resolution of the molecular representation is increased, leading to a more precise comparison of the geometrical and physico-chemical properties.
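The triangle hashing step described above can be illustrated with a minimal Python sketch. It is not the SiteEngine implementation: the property labels, the 1.0 bin size, the 10.0 side-length cutoff, and all function names are assumptions made for illustration. Each center is a (label, coordinates) pair, and a triangle is keyed by pairing every vertex label with the binned length of the opposite side, so the key does not depend on vertex order or on rigid motion of the molecule.

```python
import math
from collections import defaultdict
from itertools import combinations

def triangle_key(tri, bin_size=1.0):
    """Order-independent key: each vertex label paired with the binned
    length of the side opposite that vertex."""
    (la, pa), (lb, pb), (lc, pc) = tri
    pairs = [
        (la, int(math.dist(pb, pc) // bin_size)),
        (lb, int(math.dist(pa, pc) // bin_size)),
        (lc, int(math.dist(pa, pb) // bin_size)),
    ]
    return tuple(sorted(pairs))

def triangles(centers, max_side):
    """Yield index triples whose three sides are all at most max_side."""
    for idx in combinations(range(len(centers)), 3):
        pts = [centers[i][1] for i in idx]
        if all(math.dist(p, q) <= max_side for p, q in combinations(pts, 2)):
            yield idx

def build_table(centers, max_side=10.0, bin_size=1.0):
    """Hash every sufficiently small triangle of the complete structure."""
    table = defaultdict(list)
    for idx in triangles(centers, max_side):
        table[triangle_key([centers[i] for i in idx], bin_size)].append(idx)
    return table

def candidate_matches(site, table, max_side=10.0, bin_size=1.0):
    """Yield (site triangle, structure triangle) pairs with equal keys."""
    for idx in triangles(site, max_side):
        key = triangle_key([site[i] for i in idx], bin_size)
        for hit in table.get(key, []):
            yield idx, hit

# Toy data (hypothetical labels and coordinates).
site = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.5, 0.0, 0.0)),
    ("hydrophobic", (0.0, 4.5, 0.0)),
]
# The same pattern, translated and listed in a different order, plus a
# distant decoy center that the side-length cutoff excludes.
protein = [
    ("hydrophobic", (10.0, 24.5, 30.0)),
    ("donor", (10.0, 20.0, 30.0)),
    ("acceptor", (13.5, 20.0, 30.0)),
    ("donor", (50.0, 50.0, 50.0)),
]
matches = list(candidate_matches(site, build_table(protein)))
```

Each matching pair would then define a candidate transformation to be scored; that step is omitted here. Two limitations of this sketch are worth noting: side lengths that fall close to a bin boundary can land in different bins and be missed, and the key cannot distinguish a triangle from its mirror image.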
The biological significance of the SiteEngine method is validated on a set of biological applications. First, we introduce a benchmark dataset which is used to construct two databases: one of complete protein structures and the other of binding sites. These databases are used to perform three types of search applications: (1) a given functional site is searched against a large set of complete protein structures; (2) a potential functional site of a protein of interest is compared with known binding sites; (3) a complete protein structure is searched for the presence of an a priori unknown functional site that is similar to known sites. While the second application compares already known binding sites, the first and the third can recognize novel regions that can function as binding sites. Our method is efficient enough to allow computationally demanding applications such as the first and the third. From the biological standpoint, the first and the second applications may identify secondary binding sites of drugs that may lead to side effects. The third application finds new regions that may provide targets for drug design. Each of the three applications may aid in assigning a function and in classifying binding patterns.
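The three search applications differ mainly in what plays the role of the query and what plays the role of the database. The Python sketch below makes that explicit under strong simplifications: every site and structure is reduced to a set of property labels, and `toy_score` is a placeholder overlap measure, not the SiteEngine scoring function. All names and data are hypothetical.

```python
def toy_score(site, target):
    """Placeholder score: fraction of the site's labels found in target."""
    return len(site & target) / len(site)

def rank(scored, top=3):
    """Sort (score, name) pairs by descending score, then by name."""
    return sorted(scored, key=lambda h: (-h[0], h[1]))[:top]

def site_vs_structures(site, structures, score=toy_score):
    """Application 1: one functional site against many complete structures."""
    return rank((score(site, s), name) for name, s in structures.items())

def site_vs_known_sites(candidate, known_sites, score=toy_score):
    """Application 2: a potential site against a database of known sites."""
    return rank((score(k, candidate), name) for name, k in known_sites.items())

def structure_vs_known_sites(structure, known_sites, score=toy_score):
    """Application 3: a complete structure against all known sites."""
    return rank((score(k, structure), name) for name, k in known_sites.items())

# Toy databases (hypothetical entries).
structures = {
    "protA": {"donor", "acceptor", "aromatic", "hydrophobic"},
    "protB": {"donor"},
}
known_sites = {
    "siteX": {"donor", "acceptor", "aromatic"},
    "siteY": {"hydrophobic", "aromatic"},
}

hits_1 = site_vs_structures(known_sites["siteX"], structures)
hits_2 = site_vs_known_sites({"aromatic", "hydrophobic"}, known_sites)
hits_3 = structure_vs_known_sites(structures["protB"], known_sites)
```

In the first and third applications the target is a whole structure, so a real scorer has to locate the matching region as well as score it, which is why those two are the computationally demanding cases.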
In each application SiteEngine has successfully recognized the specific types of protein binding sites that were used as queries, such as estradiol, adenine, and ATP binding sites. The same binding sites were further used to search the ASTRAL dataset constructed from the entire PDB. Since SiteEngine searches the complete structure of each protein in a matter of seconds, we find the first application to be the most reliable for such large-scale searches. The method was also applied to the classification and functional annotation of novel proteins determined as part of the Structural Genomics project.
Short Presentation: with graphical explanations of the method and its applications.
Scoring functions: the technical details of implementation.
Contact: shulmana@tau.ac.il