Classification of conformational stability of protein mutants from 3D pseudo‐folding graph representation of protein sequences using support vector machines
Citations over time: top 20% of 2007 papers
Abstract
This work reports a novel 3D pseudo-folding graph representation of protein sequences for modeling purposes. Amino acid Euclidean distance matrices (EDMs) encode primary structural information. Amino Acid Pseudo-Folding 3D Distances Count (AAp3DC) descriptors, calculated from the EDMs of a large data set of 1363 single mutants of 64 proteins, were tested for building a classifier of the sign of the change in thermal unfolding Gibbs free energy (ΔΔG) upon single mutation. An optimal support vector machine (SVM) with a radial basis function (RBF) kernel recognized both stable and unstable mutants with accuracies above 70% in cross-validation tests. To the best of our knowledge, this result for stable mutant recognition is the highest reported for a sequence-based predictor evaluated on more than 1000 mutants. Furthermore, the model adequately classified disease-associated mutations of human prion protein and human transthyretin.
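The abstract does not give the exact definition of the AAp3DC descriptors, so the following is only a minimal sketch of the general idea: build a Euclidean distance matrix from per-residue 3D coordinates, count residue pairs per distance range, and reduce ΔΔG to a class label by its sign. The function names, the toy coordinates, the bin edges, and the handling of ΔΔG = 0 are all assumptions made for illustration, not the authors' method.

```python
import math
from itertools import combinations


def euclidean_distance_matrix(coords):
    """Return the symmetric matrix of pairwise Euclidean distances."""
    n = len(coords)
    edm = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = math.dist(coords[i], coords[j])
        edm[i][j] = edm[j][i] = d
    return edm


def distance_counts(edm, bin_edges):
    """Count residue pairs whose distance falls in each [lo, hi) bin.

    Hypothetical descriptor: the paper's AAp3DC definition may differ.
    """
    counts = [0] * (len(bin_edges) - 1)
    for i, j in combinations(range(len(edm)), 2):
        for k in range(len(counts)):
            if bin_edges[k] <= edm[i][j] < bin_edges[k + 1]:
                counts[k] += 1
                break
    return counts


def ddg_sign_label(ddg):
    """Map a ΔΔG value to +1 or -1 by its sign.

    Assumption: zero is grouped with positive values. Which sign means
    "stable" depends on the data set's convention and is not fixed here.
    """
    return 1 if ddg >= 0 else -1


# Toy coordinates (a unit square), standing in for pseudo-folding positions.
toy_coords = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
toy_edm = euclidean_distance_matrix(toy_coords)
toy_features = distance_counts(toy_edm, [0.0, 1.2, 2.0])
```

Feature vectors of this kind, paired with the sign labels, would then be the input to an RBF-kernel SVM evaluated by cross-validation, as the abstract describes.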