Combined alignments of sequences and domains characterize unknown proteins with remotely related protein search PSISearch2D

UniProt, Protein sequencing, Protein superfamily, Multiple sequence alignment, Structural alignment, Smith–Waterman algorithm
DOI: 10.1093/database/baz092 Publication Date: 2019-06-16T07:07:32Z
ABSTRACT
Iterative homology search has been widely used in the identification of remotely related proteins. Our previous study found that query-seeded iterative sequence search can reduce homologous over-extension errors and greatly improve selectivity. However, iterative homology search remains challenging in protein functional prediction. More sensitive scoring models are highly needed to improve the predictive performance of alignment methods, and alignment annotation with better visualization has also become imperative for result interpretation. Here we report an open-source application, PSISearch2D, that runs query-seeded iterative sequence search for remotely related protein detection. PSISearch2D retrieves domain annotation from Pfam, UniProtKB, CDD and PROSITE for the resulting hits and demonstrates combined domain and sequence alignments in novel visualizations. A scoring model called C-value is newly defined to re-order hits with consideration of the combination of sequence and domain alignments. The benchmarking on the use of C-value indicates that PSISearch2D outperforms the original PSISearch2 tool in terms of both accuracy and specificity. PSISearch2D improves the characterization of unknown proteins in remote protein detection. Our evaluation tests show that PSISearch2D provided annotation for 77 695 of 139 503 unknown bacteria proteins and 140 751 of 352 757 unknown virus proteins, about 2.3-fold and 1.8-fold more than the original PSISearch2, respectively. Together with advanced features of an auto-iteration mode to handle large-scale data and optional programs for global and local sequence alignments, PSISearch2D enhances remotely related protein search.