- Bioinformatics and Genomic Networks
- Genomics and Phylogenetic Studies
- Biomedical Text Mining and Ontologies
- Machine Learning in Bioinformatics
- Advanced Proteomics Techniques and Applications
- Genetics, Bioinformatics, and Biomedical Research
- RNA and protein synthesis mechanisms
- Scientific Computing and Data Management
- Software Testing and Debugging Techniques
- Advanced Biosensing Techniques and Applications
- Metabolomics and Mass Spectrometry Studies
- Topic Modeling
- Enzyme Structure and Function
- Software Engineering Research
- Microbial Metabolic Engineering and Bioproduction
- Gene expression and cancer classification
- Natural Language Processing Techniques
- Microbial Natural Products and Biosynthesis
Wellcome Sanger Institute
2008-2014
European Bioinformatics Institute
2005-2014
University College London
2008-2011
Université Claude Bernard Lyon 1
2008-2011
Georgetown University
2008-2011
University Hospital Heidelberg
2008-2011
SIB Swiss Institute of Bioinformatics
2008-2011
Heidelberg University
2008-2011
University of Bristol
2011
University of London
2011
Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and a complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional...
The InterPro database (http://www.ebi.ac.uk/interpro/) integrates predictive models or 'signatures' representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually, and approximately half of the total 58,000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web...
InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new...
Molecular interaction information is a key resource in modern biomedical research. Publicly available data have previously been provided in a broad array of diverse formats, making access to this information very difficult. The publication and wide implementation of the Human Proteome Organisation Proteomics Standards Initiative Molecular Interactions (HUPO PSI-MI) format in 2004 was a major step towards the establishment of a single, unified format by which molecular interactions should be presented, but it focused purely on protein-protein...
The PRIDE (http://www.ebi.ac.uk/pride) database of protein and peptide identifications was previously described in the NAR Database Special Edition in 2006. Since this publication, the volume of public data in the PRIDE relational database has increased by more than an order of magnitude. Several significant datasets have been added, including processed mass spectra generated by the HUPO Brain Proteome Project and the HUPO Liver Proteome Project. The PRIDE software development team has made several changes and additions to the user interface and tool set associated with PRIDE....
Summary: Dasty2 is a highly interactive web client integrating protein sequence annotations from currently more than 40 sources, using the distributed annotation system (DAS). Availability: Dasty2 is an open source tool freely available under the terms of the Apache License 2.0, publicly available at http://www.ebi.ac.uk/dasty/ Contact: hhe@ebi.ac.uk
In this study, we present two freely available and complementary Distributed Annotation System (DAS) resources: a DAS reference server that provides up-to-date sequence and annotation from UniProt, with additional feature links and database cross-references from InterPro, and a DAS client implemented using Java and Macromedia Flash that is optimized for the display of protein features.
A large number of diverse, complex, and distributed data resources are currently available in the Bioinformatics domain. The pace of discovery and the diversity of information mean that centralised reference databases like UniProt and Ensembl cannot integrate all potentially relevant information sources. From a user perspective, however, access to all relevant information concerning a specific query is essential. The Distributed Annotation System (DAS) defines a communication protocol to exchange annotations on genomic and protein sequences; this...
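As a rough illustration of the annotation exchange that DAS describes, the sketch below builds a DAS 1.x-style `features` request URL and parses a small DASGFF-style XML response. It assumes the commonly documented DAS 1.x URL layout (`/das/<source>/features?segment=...`) and element names (`FEATURE`, `TYPE`, `START`, `END`). The server address, the source name `uniprot`, the accession `P12345`, and the sample features are hypothetical placeholders, not values taken from the abstracts above.

```python
from dataclasses import dataclass
from urllib.parse import quote
import xml.etree.ElementTree as ET


@dataclass
class Feature:
    feature_id: str
    type_id: str
    start: int
    end: int


def features_url(server: str, source: str, segment: str) -> str:
    """Build a DAS 1.x-style 'features' request URL (assumed layout)."""
    base = server.rstrip("/")
    # Keep ':' and ',' unescaped so ranges such as 'P12345:1,100' stay readable.
    return f"{base}/das/{quote(source)}/features?segment={quote(segment, safe=':,')}"


def parse_dasgff(xml_text: str) -> list[Feature]:
    """Extract id, type, start and end from each FEATURE element."""
    root = ET.fromstring(xml_text)
    features = []
    for feat in root.iter("FEATURE"):
        type_el = feat.find("TYPE")
        features.append(
            Feature(
                feature_id=feat.get("id", ""),
                type_id=type_el.get("id", "") if type_el is not None else "",
                start=int(feat.findtext("START", default="0").strip()),
                end=int(feat.findtext("END", default="0").strip()),
            )
        )
    return features


# Hypothetical response, hand-written for this sketch (no network access needed).
SAMPLE = """<DASGFF>
  <GFF version="1.0" href="http://example.org/das/uniprot/features">
    <SEGMENT id="P12345" start="1" stop="300">
      <FEATURE id="dom1" label="Example domain">
        <TYPE id="domain">domain</TYPE>
        <START>10</START>
        <END>50</END>
      </FEATURE>
      <FEATURE id="site1" label="Example site">
        <TYPE id="active_site">active site</TYPE>
        <START>120</START>
        <END>120</END>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>"""

if __name__ == "__main__":
    print(features_url("http://example.org", "uniprot", "P12345"))
    for f in parse_dasgff(SAMPLE):
        print(f)
```

A client in this style would issue the same request to several annotation servers and merge the parsed features by position on the sequence; the merging step is left out here.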