Benchmark Data Set for in Silico Prediction of Ames Mutagenicity
Journal of Chemical Information and Modeling, 2009, Vol. 49(9), pp. 2077–2081
Katja Hansen, Sebastian Mika, Timon Schroeter, Andreas Sutter, Antonius ter Laak, Thomas Steger‐Hartmann, Nikolaus Heinrich, Klaus‐Robert Müller
Abstract
Up to now, publicly available data sets to build and evaluate Ames mutagenicity prediction tools have been very limited in terms of size and chemical space covered. In this report we describe a new unique public Ames mutagenicity data set comprising about 6500 nonconfidential compounds (available as SMILES strings and SDF) together with their biological activity. Three commercial tools (DEREK, MultiCASE, and an off-the-shelf Bayesian machine learner in Pipeline Pilot) are compared with four noncommercial machine learning implementations (Support Vector Machines, Random Forests, k-Nearest Neighbors, and Gaussian Processes) on the new benchmark data set.