Accurate single-sequence prediction of solvent accessible surface area using local and global features
Abstract
We present a new approach for predicting the Accessible Surface Area (ASA) using a General Neural Network (GENN). The novelty of the approach lies in not using residue mutation profiles generated by multiple sequence alignments as descriptive inputs. Instead, we use solely sequential window information and global features such as single-residue and two-residue compositions of the chain. The resulting predictor is far more efficient than sequence alignment-based predictors while reaching comparable accuracy. Introducing the global inputs contributes significantly to achieving this comparable accuracy. The predictor, termed ASAquick, is tested on predicting the ASA of globular proteins and is found to perform similarly well for so-called easy and hard cases, indicating generalizability and possible usability for de novo protein structure prediction. The source code and Linux executables for GENN and ASAquick are available from Research and Information Systems at http://mamiris.com, from the SPARKS Lab at http://sparks-lab.org, and from the Battelle Center for Mathematical Medicine at http://mathmed.org.
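To make the input description concrete, the following is a minimal sketch of how per-residue inputs could be assembled from a single sequence: a one-hot sliding window around the target residue, concatenated with the chain's single-residue and two-residue compositions. It is not the ASAquick implementation. The function names, the window half-width, the padding flag, and the reading of "two-residue composition" as frequencies of adjacent residue pairs are all assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
PAIR_INDEX = {a + b: i for i, (a, b) in enumerate(product(AMINO_ACIDS, repeat=2))}


def single_residue_composition(seq):
    """Fraction of each of the 20 amino acids in the chain (global feature)."""
    counts = [0.0] * len(AMINO_ACIDS)
    for aa in seq:
        if aa in AA_INDEX:
            counts[AA_INDEX[aa]] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def two_residue_composition(seq):
    """Fraction of each of the 400 adjacent residue pairs (assumed definition)."""
    counts = [0.0] * len(PAIR_INDEX)
    for a, b in zip(seq, seq[1:]):
        if a + b in PAIR_INDEX:
            counts[PAIR_INDEX[a + b]] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def window_features(seq, pos, half_width=2):
    """One-hot encoding of residues in [pos - half_width, pos + half_width].

    Each position uses 21 slots: 20 amino acids plus one flag that marks
    positions beyond the chain ends or non-standard residues (hypothetical choice).
    """
    feats = []
    for j in range(pos - half_width, pos + half_width + 1):
        one_hot = [0.0] * (len(AMINO_ACIDS) + 1)
        if 0 <= j < len(seq) and seq[j] in AA_INDEX:
            one_hot[AA_INDEX[seq[j]]] = 1.0
        else:
            one_hot[-1] = 1.0
        feats.extend(one_hot)
    return feats


def residue_input(seq, pos, half_width=2):
    """Local window features followed by the two global composition vectors."""
    return (
        window_features(seq, pos, half_width)
        + single_residue_composition(seq)
        + two_residue_composition(seq)
    )
```

With a half-width of 2, each residue yields 5 × 21 window values plus 20 + 400 global values. The global part is identical for every residue in a chain, so in practice it would be computed once per sequence and reused; no alignment or database search is needed at any step.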