Molecular Hashkeys: A Novel Method for Molecular Characterization and Its Application for Predicting Important Pharmaceutical Properties of Molecules
Citations Over Time: Top 10% of 1999 papers
Abstract
We define a novel numerical molecular representation, called the molecular hashkey, that captures sufficient information about a molecule to predict pharmaceutically interesting properties directly from three-dimensional molecular structure. The molecular hashkey represents molecular surface properties as a linear array of pairwise surface-based comparisons of the target molecule against a common 'basis-set' of molecules. Hashkey-measured molecular similarity correlates well with direct methods of measuring molecular surface similarity. Using a simple machine-learning technique with the molecular hashkeys, we show that it is possible to accurately predict the octanol-water partition coefficient, log P. Using more sophisticated learning techniques, we show that an accurate model of intestinal absorption for a set of drugs can be constructed using the same hashkeys used in the aforementioned experiments. Once a set of molecular hashkeys is calculated, its use in the training and testing of property-based models is very fast. Further, the required amount of data for model construction is very small. Neural network-based hashkey models trained on data sets as small as 30 molecules yield statistically significant prediction of molecular properties. The lack of a requirement for large data sets lends itself well to the prediction of pharmaceutically relevant molecular parameters for which data generation is expensive and slow. Molecular hashkeys coupled with machine-learning techniques can yield models that predict key pharmacological aspects of biologically important molecules and should therefore be important in the design of effective therapeutics.
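The construction described in the abstract, a linear array of pairwise comparisons of a target molecule against a common basis set, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: molecules are represented as sets of feature labels, `toy_similarity` is a simple set-overlap score standing in for the surface-based comparison (which the abstract does not specify), and `hashkey_distance` uses Euclidean distance as one plausible way to compare two hashkeys.

```python
import math
from typing import Callable, FrozenSet, List, Sequence

# Hypothetical stand-in for a molecule: a set of feature labels.
# The method in the abstract compares 3D molecular surfaces instead.
Molecule = FrozenSet[str]


def toy_similarity(a: Molecule, b: Molecule) -> float:
    """Set-overlap score in [0, 1]; a placeholder, NOT a surface comparison."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def hashkey(
    target: Molecule,
    basis: Sequence[Molecule],
    similarity: Callable[[Molecule, Molecule], float],
) -> List[float]:
    """Return one similarity value per basis molecule, in basis order."""
    return [similarity(target, b) for b in basis]


def hashkey_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two hashkeys (an assumed metric)."""
    if len(u) != len(v):
        raise ValueError("hashkeys must be built from the same basis set")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


# Illustrative basis set and target; the labels are made up.
basis = [
    frozenset({"donor", "acceptor"}),
    frozenset({"hydrophobe"}),
    frozenset({"donor", "hydrophobe", "ring"}),
]
target = frozenset({"donor", "ring"})

key = hashkey(target, basis, toy_similarity)
```

Because every molecule is scored against the same basis set, all hashkeys have the same length and can be passed directly to a regression or neural-network model as fixed-size input vectors. This is consistent with the abstract's remark that hashkeys are fast to reuse once they have been calculated.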