A maximum common substructure-based algorithm for searching and predicting drug-like compounds

Similarity (geometry); Similarity measure; Basis (linear algebra); Cheminformatics; Vectorization (mathematics); Substructure; Biological data
DOI: 10.1093/bioinformatics/btn186 Publication Date: 2008-06-27T07:43:13Z
ABSTRACT
The prediction of biologically active compounds is of great importance for high-throughput screening (HTS) approaches in drug discovery and chemical genomics. Many computational methods in this area focus on measuring the structural similarities between chemical structures. However, traditional similarity measures are often too rigid or consider only global similarities between structures. The maximum common substructure (MCS) approach provides a more promising and flexible alternative for predicting bioactive compounds.

In this article, a new backtracking algorithm for MCS is proposed and compared to global similarity measurements. Our algorithm provides high flexibility in the matching process, and it is very efficient in identifying local structural similarities. To predict and cluster biologically active compounds more efficiently, the concept of basis compounds is proposed that enables researchers to easily combine the MCS-based and traditional similarity measures with modern machine learning techniques. Support vector machines (SVMs) are used to test how the MCS-based similarity measure and the basis compound vectorization method perform on two empirically tested datasets. The test results show that MCS complements the well-known atom pair descriptor-based similarity measure. By combining these two measures, our SVM-based model predicts the biological activities of chemical compounds with higher specificity and sensitivity.

Supplementary data are available at Bioinformatics online.
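
The basis compound idea in the abstract can be illustrated with a minimal sketch: each compound is turned into a numeric vector whose entries are its similarities to a fixed set of reference ("basis") compounds, and that vector can then be passed to a learner such as an SVM. The sketch below is not the authors' implementation. As an assumption, it uses a plain Tanimoto coefficient over sets of descriptor strings in place of the MCS-based measure, and the names `tanimoto`, `vectorize`, `basis`, and `query`, as well as the toy descriptor strings, are hypothetical.

```python
from typing import FrozenSet, List, Sequence

# Assumption: a compound is represented as a set of descriptor strings.
# The paper's MCS-based similarity is NOT implemented here.
Descriptor = FrozenSet[str]


def tanimoto(a: Descriptor, b: Descriptor) -> float:
    """Tanimoto coefficient |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def vectorize(compound: Descriptor, basis: Sequence[Descriptor]) -> List[float]:
    """Represent a compound by its similarity to each basis compound."""
    return [tanimoto(compound, ref) for ref in basis]


# Toy descriptor sets, invented for illustration only.
basis = [
    frozenset({"C-C:1", "C-N:2", "C-O:3"}),
    frozenset({"C-C:1", "C-S:2"}),
]
query = frozenset({"C-C:1", "C-N:2"})

# One feature per basis compound; the vector length equals len(basis).
features = vectorize(query, basis)
print(features)
```

Any pairwise similarity function with the same signature could be substituted for `tanimoto`, and vectors from several measures could be concatenated before training, which is one plausible reading of how the abstract's combination of MCS-based and atom pair-based measures might be set up.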