A new method for identification of protein (sub)families in a set of proteins based on hydropathy distribution in proteins
Abstract
Structural similarity among proteins is reflected in the distribution of hydropathy along the protein sequence. Similarities in hydropathy distributions are obvious for homologous proteins within a protein family. They have also been observed for proteins with related structures, even when sequence similarity was undetectable. Here we present a novel method that uses the hydropathy distribution in proteins to identify (sub)families in a set of (homologous) proteins. We represent each protein as a point in a generalized hydropathy space, described by a vector of specifically defined features that are derived from the hydropathy of the individual amino acids. Projection of this space onto its principal axes reveals groups of proteins with related hydropathy distributions. The groups identified correspond well to families of structurally and functionally related proteins. We found that this method accurately identifies protein families in a set of proteins, or subfamilies in a set of homologous proteins. Our results show that protein families can be identified by analysis of the hydropathy distribution, without the need for sequence alignment.
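The general idea of representing proteins as feature vectors and projecting them onto principal axes can be illustrated with a minimal sketch. The abstract does not specify the features, so everything concrete here is an assumption made for illustration: the Kyte-Doolittle hydropathy scale, the choice of mean hydropathy over equal-length sequence segments as features, the function names `hydropathy_features` and `project_onto_principal_axes`, and the toy sequences. It is not the authors' feature definition.

```python
import numpy as np

# Kyte-Doolittle hydropathy scale (assumed here; the abstract names no scale).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_features(seq, n_bins=4):
    """Hypothetical feature vector: mean hydropathy in n_bins sequence segments.

    Splitting by relative position gives vectors of equal length for
    sequences of different lengths, so no alignment is needed.
    """
    if len(seq) < n_bins:
        raise ValueError("sequence is shorter than the number of bins")
    values = np.array([KD[aa] for aa in seq.upper()], dtype=float)
    return np.array([seg.mean() for seg in np.array_split(values, n_bins)])


def project_onto_principal_axes(features, n_axes=2):
    """Center the feature matrix and project it onto its first principal axes."""
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_axes].T


# Toy sequences (invented): two with a hydrophobic N-terminal half,
# two with a hydrophobic C-terminal half.
toy_sequences = [
    "LLIVVALLIVFAKDEERKDNQE",
    "IVLLAVILLVAFDEKRKENDQK",
    "KDEERKDNQELLIVVALLIVFA",
    "DEKRKENDQKIVLLAVILLVAF",
]

features = np.vstack([hydropathy_features(s) for s in toy_sequences])
coords = project_onto_principal_axes(features)

for seq, (pc1, pc2) in zip(toy_sequences, coords):
    print(f"{seq}  PC1={pc1:+.2f}  PC2={pc2:+.2f}")
```

In this toy set, the two sequences of each kind fall on the same side of the first principal axis, which is the sort of grouping the projection is meant to expose. Real use would need the feature definitions from the paper itself and a clustering or visual inspection step on the projected points.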