A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics

Shewanella oneidensis
DOI: 10.1093/bioinformatics/btn218 Publication Date: 2008-05-04T04:45:50Z
ABSTRACT
Motivation: The standard approach to identifying peptides based on accurate mass and elution time (AMT) compares profiles obtained from a high resolution mass spectrometer to a database of peptides previously identified from tandem mass spectrometry (MS/MS) studies. It would be advantageous, with respect to both accuracy and cost, to only search for those peptides that are detectable by MS (proteotypic). Results: We present a support vector machine (SVM) model that uses a simple descriptor space based on 35 properties of amino acid content, charge, hydrophilicity and polarity for the quantitative prediction of proteotypic peptides. Using three independently derived AMT databases (Shewanella oneidensis, Salmonella typhimurium, Yersinia pestis) for training and validation within and across species, the SVM resulted in an average accuracy measure of 0.8 with an SD of <0.025. Furthermore, we demonstrate that these results are achievable with a small set of 12 variables and can achieve high proteome coverage. Availability: http://omics.pnl.gov/software/STEPP.php Contact: bj@pnl.gov Supplementary information: Supplementary data are available at Bioinformatics online.
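To make the idea of a "simple descriptor space" concrete, the following is a minimal sketch of how a peptide sequence might be turned into a few numeric features covering amino acid content, charge, hydropathy and polarity. It is not the 35-property set from the paper, which the abstract does not list. The function name `peptide_descriptors`, the choice of features, the use of the Kyte-Doolittle hydropathy scale as a stand-in for a hydrophilicity scale, and the simplified charge and polarity groupings are all assumptions made for illustration.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values, used here as a stand-in scale (assumption:
# the abstract does not say which hydrophilicity scale the model uses).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified residue groupings (assumptions for illustration only).
POSITIVE = set("KR")         # treated as +1 at neutral pH
NEGATIVE = set("DE")         # treated as -1 at neutral pH
POLAR_UNCHARGED = set("STNQ")


def peptide_descriptors(sequence):
    """Return a small dict of hypothetical descriptors for one peptide."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty peptide sequence")
    unknown = set(seq) - set(KD_HYDROPATHY)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")

    n = len(seq)
    counts = Counter(seq)
    n_pos = sum(counts[aa] for aa in POSITIVE)
    n_neg = sum(counts[aa] for aa in NEGATIVE)
    n_polar = sum(counts[aa] for aa in POLAR_UNCHARGED)

    return {
        "length": n,
        "net_charge": n_pos - n_neg,          # crude integer estimate
        "frac_positive": n_pos / n,
        "frac_negative": n_neg / n,
        "frac_polar_uncharged": n_polar / n,
        "mean_hydropathy": sum(KD_HYDROPATHY[aa] for aa in seq) / n,
    }


if __name__ == "__main__":
    # Example peptide chosen arbitrarily for demonstration.
    for name, value in peptide_descriptors("AKDESTLLR").items():
        print(f"{name}: {value}")
```

Feature vectors of this kind, computed for peptides labeled as observed or not observed in an AMT database, could then be passed to any SVM implementation (for example scikit-learn's `SVC`) for training and cross-species validation.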