Density parameter estimation for finding clusters of homologous proteins—tracing actinobacterial pathogenicity lifestyles

Keywords: Homology; Protein superfamily; Comparative genomics
DOI: 10.1093/bioinformatics/bts653 Publication Date: 2012-11-10T04:30:01Z
ABSTRACT
Homology detection is a long-standing challenge in computational biology. To tackle this problem, all-versus-all BLAST results are typically coupled with data partitioning approaches, resulting in clusters of putative homologous proteins. One of the main problems, however, has been widely neglected: all clustering tools need a density parameter that adjusts the number and size of the clusters. This parameter is crucial but hard to estimate without a gold standard at hand. Developing a gold standard, however, is a difficult and time-consuming task. Having a reliable method for detecting clusters of homologous proteins between a huge set of species would open opportunities for better understanding the genetic repertoire of bacteria with different lifestyles.

Our main contribution is a method for identifying a suitable and robust density parameter for protein homology detection without a given gold standard. To this end, we study the core genome of 89 actinobacteria. This allows us to incorporate background knowledge, i.e. the assumption that a set of evolutionarily closely related species should share a comparably high number of evolutionarily conserved proteins (emerging from phylum-specific housekeeping genes). We apply our strategy to find genes/proteins that are specific for certain actinobacterial lifestyles, i.e. different types of pathogenicity. The whole study was performed with transitivity clustering, as it only requires a single intuitive density parameter and has been shown to be well applicable for the task of protein sequence clustering. Note, however, that the presented strategy generally does not depend on our clustering method and can easily be adapted to other clustering approaches.

All results are publicly available at http://transclust.mmci.uni-saarland.de/actino_core/ or as Supplementary Material of this article.

Contact: roettger@mpi-inf.mpg.de

Supplementary data are available at Bioinformatics online.
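The core-genome idea can be illustrated with a small sketch. It is not the authors' implementation: transitivity clustering is replaced here by plain connected components over thresholded similarity edges, the selection rule (pick the threshold that yields the most clusters covering every species) is only one plausible reading of the abstract, and the protein names, scores and thresholds are invented toy data.

```python
from collections import defaultdict


def cluster(proteins, edges, threshold):
    """Stand-in clustering (NOT transitivity clustering): connected
    components over all edges whose similarity score is >= threshold."""
    parent = {p: p for p in proteins}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, score in edges:
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for p in proteins:
        groups[find(p)].add(p)
    return list(groups.values())


def core_cluster_count(clusters, species_of, all_species):
    """Count clusters that contain at least one protein of every species."""
    return sum(
        1 for c in clusters if {species_of[p] for p in c} == all_species
    )


def scan_thresholds(proteins, edges, species_of, thresholds):
    """Return {threshold: number of core clusters} for each candidate."""
    all_species = set(species_of.values())
    return {
        t: core_cluster_count(cluster(proteins, edges, t), species_of, all_species)
        for t in thresholds
    }


# Hypothetical toy data: three species (A, B, C), two gene families (x, y).
species_of = {"Ax": "A", "Bx": "B", "Cx": "C", "Ay": "A", "By": "B", "Cy": "C"}
proteins = list(species_of)
edges = [
    ("Ax", "Bx", 80), ("Bx", "Cx", 75),  # family x
    ("Ay", "By", 90), ("By", "Cy", 85),  # family y
    ("Ax", "Ay", 30),                    # weak cross-family hit
]

counts = scan_thresholds(proteins, edges, species_of, thresholds=[20, 50, 88])
best = max(counts, key=counts.get)  # threshold with the most core clusters
```

In this toy setting a threshold that is too permissive merges both families into one cluster, and one that is too strict fragments them so that no cluster covers all species; the intermediate value recovers both families as core clusters.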