Integrating multi-attribute similarity networks for robust representation of the protein space
0301 basic medicine
Protein Folding
Models, Statistical
Protein Conformation
Computational Biology
Proteins
Bayes Theorem
Evolution, Molecular
Automation
03 medical and health sciences
Sequence Analysis, Protein
Cluster Analysis
Databases, Protein
Algorithms
Probability
DOI:
10.1093/bioinformatics/btl130
Publication Date:
2006-04-05T00:24:30Z
AUTHORS (3)
ABSTRACT
Motivation: A global view of the protein space is essential for functional and evolutionary analysis of proteins. In order to achieve this, a similarity network can be built using pairwise relationships among proteins. However, existing similarity networks employ a single similarity measure and therefore their utility depends highly on the quality of the selected measure. A more robust representation of the protein space can be realized if multiple sources of information are used.
Results: We propose a novel approach for analyzing multi-attribute similarity networks by combining random walks on graphs with Bayesian theory. A multi-attribute network is created by combining sequence and structure based similarity measures. For each attribute of the similarity network, one can compute a measure of affinity from a given protein to every other protein in the network using random walks. This process makes use of the implicit clustering information of the similarity network, and we show that it is superior to naive, local ranking methods. We then combine the computed affinities using a Bayesian framework. In particular, when we train a Bayesian model for automated classification of a novel protein, we achieve high classification accuracy and outperform single attribute networks. In addition, we demonstrate the effectiveness of our technique by comparison with a competing kernel-based information integration approach.
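The affinity computation described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the random-walk variant, so this assumes a random walk with restart, and the function name `random_walk_affinity`, the restart probability of 0.3, and the five-protein toy network are all hypothetical. It covers only one attribute (one similarity matrix) and does not show the Bayesian combination step.

```python
def random_walk_affinity(weights, query, restart=0.3, tol=1e-12, max_iter=10000):
    """Affinity from `query` to every node, via random walk with restart.

    `weights` is a symmetric matrix of pairwise similarity scores for one
    attribute. Assumes every node has at least one edge with nonzero weight.
    """
    n = len(weights)
    # Row-normalize similarity scores into transition probabilities.
    trans = []
    for row in weights:
        total = sum(row)
        trans.append([w / total for w in row])
    # Start with all probability mass on the query protein.
    p = [1.0 if i == query else 0.0 for i in range(n)]
    for _ in range(max_iter):
        nxt = [0.0] * n
        # Follow an edge with probability (1 - restart) ...
        for i in range(n):
            for j in range(n):
                nxt[j] += (1.0 - restart) * p[i] * trans[i][j]
        # ... or jump back to the query with probability `restart`.
        nxt[query] += restart
        delta = sum(abs(a - b) for a, b in zip(nxt, p))
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy network: proteins 0, 1, 2 form one cluster, 3 and 4 another,
# with a weak link between 2 and 3. Proteins 0 and 2 share no direct edge.
seq = [
    [0.0, 0.9, 0.0, 0.0, 0.0],
    [0.9, 0.0, 0.8, 0.0, 0.0],
    [0.0, 0.8, 0.0, 0.1, 0.0],
    [0.0, 0.0, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.0, 0.9, 0.0],
]
aff = random_walk_affinity(seq, query=0)
for protein, score in enumerate(aff):
    print(protein, round(score, 4))
```

In this toy case, protein 2 receives a nonzero affinity from protein 0 even though their direct similarity is zero, because the walk reaches it through protein 1. A local ranking based only on direct scores would not distinguish protein 2 from proteins 3 and 4. In a multi-attribute setting, the same function would be run once per similarity matrix (for example, one for sequence and one for structure) before the affinities are combined.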
Availability: Source code is available upon request from the primary author.
Contact: orhan@cs.ucsb.edu
Supplementary Information: Supplementary data are available at Bioinformatics online.
REFERENCES (38)
CITATIONS (15)