- Genetics, Bioinformatics, and Biomedical Research
- Genomics and Phylogenetic Studies
- Machine Learning in Bioinformatics
- Biomedical Text Mining and Ontologies
- Scientific Computing and Data Management
- RNA and protein synthesis mechanisms
- Bioinformatics and Genomic Networks
- Advanced Proteomics Techniques and Applications
- Gene expression and cancer classification
- Biomedical and Engineering Education
- Research Data Management Practices
- Protein Structure and Dynamics
- Receptor Mechanisms and Signaling
- Semantic Web and Ontologies
- Enzyme Structure and Function
- Forensic and Genetic Research
- Microbial Community Ecology and Physiology
- Chromosomal and Genetic Variations
- Genomics and Chromatin Dynamics
- Cancer Genomics and Diagnostics
- Molecular Biology Techniques and Applications
- Microbial Metabolic Engineering and Bioproduction
- Axon Guidance and Neuronal Signaling
- Chemical Synthesis and Analysis
- Computational Drug Discovery Methods
University of Manchester
2015-2025
Radboud University Nijmegen
2010-2019
Radboud University Medical Center
2010-2019
European Bioinformatics Institute
2004-2014
Sainsbury Laboratory
2014
Netherlands Bioinformatics Centre
2013-2014
Instituto Gulbenkian de Ciência
2013-2014
Norwich Research Park
2014
Faculty (United Kingdom)
2014
Centro de Investigación del Cáncer
2011-2013
The InterPro database (http://www.ebi.ac.uk/interpro/) integrates predictive models or 'signatures' representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually, and approximately half of the total 58,000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web...
InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and functionality to include residue-level annotation...
The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro,...
The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it...
InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new...
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended match views, taxonomic range information and protein 3D structure...
InterPro is an integrated resource for protein families, domains and functional sites, which integrates the following protein signature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PANTHER. The latter two new member databases have been integrated since the last publication in this journal. There have been several new developments in InterPro, including an additional reading field, new database links, extensions to the web interface and additional match XML files. InterPro has always provided matches to UniProtKB proteins on...
The PRINTS database houses a collection of protein fingerprints. These may be used to assign uncharacterised sequences to known families and hence to infer tentative functions. The September 2002 release (version 36.0) includes 1800 fingerprints, encoding ∼11 000 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. In addition to its continued steady growth, we report here the development of an automatic supplement, prePRINTS, designed to increase the coverage of the resource and reduce some of the manual...
Recommendations about structuring proteomic biomarker studies should increase the probability that such markers will be clinically useful.
InterPro is a new integrated documentation resource for protein families, domains and functional sites, developed initially as a means of rationalising the complementary efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Merged annotations from PRINTS, PROSITE and Pfam form the InterPro core. Each combined InterPro entry includes descriptions and literature references, and links are made back to the relevant parent database(s), allowing users to see at a glance whether a particular family or domain has associated patterns, profiles, fingerprints,...
The lipocalins and fatty acid‐binding proteins (FABPs) are two recently identified protein families that both function by binding small hydrophobic molecules. We have sought to clarify relationships within and between these two groups through an analysis of both structure and sequence. Within a similar overall folding pattern, we find large parts of the lipocalin and FABP structures to be quantitatively equivalent. The three largest structurally conserved regions of the common core correspond to characteristic sequence...
Recently we reported the design of a discriminating fingerprint for rhodopsin-like G-protein-coupled receptors (GPCRs). The fingerprint encodes seven putative membrane-spanning motifs and was potently diagnostic of all GPCRs (52 in all) in version 8.1 of the OWL composite sequence database, readily distinguishing them from other integral membrane proteins. With a 3-fold increase in the size of OWL, the fingerprint has been updated and now finds 332 sequences that match the motifs. The situation, however, has grown in complexity: 61 sequences make imperfect matches with...
The PRINTS database houses a collection of protein family fingerprints. These are groups of motifs that together are diagnostically more potent than single motifs by virtue of the biological context afforded by matching motif neighbours. Around 1200 fingerprints have now been created and stored in the database. The September 1999 release (version 24.0) encodes ~7200 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. In addition to its continued steady growth, we report here several major...
Regions of protein sequences with biased amino acid composition (so-called Low-Complexity Regions (LCRs)) are abundant in the protein universe. A number of studies have revealed that i) these regions show significant divergence across protein families; ii) the genetic mechanisms from which they arise lend them remarkable degrees of compositional plasticity. They have therefore proved difficult to compare using conventional sequence analysis techniques, and their functions remain to be elucidated for most of them. Here we undertake a...
The WHAT IF molecular-modelling and drug design program is widely distributed in the world of protein structure bioinformatics. Although originally designed as an interactive application, its highly modular design and inbuilt control language have recently enabled its deployment as a collection of programmatically accessible web services. We report here WHAT IF-based protein structure bioinformatics web services: these relate to structure quality, the use of symmetry in crystal structures, structure correction and optimization, adding hydrogens and optimizing hydrogen bonds...
The PRINTS database, now in its 21st year, houses a collection of diagnostic protein family 'fingerprints'. Fingerprints are groups of conserved motifs, evident in multiple sequence alignments, whose unique inter-relationships provide distinctive signatures for particular protein families and structural/functional domains. As such, they may be used to assign uncharacterized sequences to known families, and hence to infer tentative functional, structural and/or evolutionary relationships. The February 2012 release...
While large numbers of proteomic biomarkers have been described, they are generally not implemented in medical practice. We investigated the reasons for this shortcoming, focusing on hurdles downstream of biomarker verification, and describe major obstacles and possible solutions to ease valid biomarker implementation. Some of the problems lie in suboptimal biomarker discovery and validation, especially the lack of validated platforms with well-described performance characteristics to support biomarker qualification. These issues have been acknowledged and are being...
Bioinformatics is now intrinsic to life science research, but the past decade has witnessed a continuing deficiency in this essential expertise. Basic data stewardship is still taught relatively rarely in education programmes, creating a chasm between theory and practice, and fuelling demand for bioinformatics training across all educational levels and career roles. Concerned by this, surveys have been conducted in recent years to monitor computational needs worldwide. This article briefly reviews the principal...
The consortium members are situated at different research centres around the world.
The PRINTS database of protein ‘fingerprints’ is described. Fingerprints comprise sets of motifs excised from conserved regions of sequence alignments, their diagnostic power or potency being refined by iterative scanning (in this case of the OWL composite database). Generally, the motifs do not overlap, but are separated along a sequence, though they may be contiguous in 3-D space. The use of groups of independent, linearly or spatially separate motifs allows particular folds and functionalities to be characterized more flexibly...