- Genomics and Phylogenetic Studies
- Scientific Computing and Data Management
- RNA and protein synthesis mechanisms
- Genetics, Bioinformatics, and Biomedical Research
- Advanced Proteomics Techniques and Applications
- Machine Learning in Bioinformatics
- Bioinformatics and Genomic Networks
- Biomedical Text Mining and Ontologies
- Gene expression and cancer classification
- Microbial Community Ecology and Physiology
- Research Data Management Practices
- Enzyme Structure and Function
- Bacteriophages and microbial interactions
- Immune Cell Function and Interaction
- Microbial Metabolic Engineering and Bioproduction
- T-cell and B-cell Immunology
- Metaheuristic Optimization Algorithms Research
- Advanced Multi-Objective Optimization Algorithms
- Algorithms and Data Compression
- Oral microbiology and periodontitis research
- Topic Modeling
- Protein Structure and Dynamics
- Immunotherapy and Immune Responses
- Evolutionary Algorithms and Applications
- Environmental DNA in Biodiversity Studies
Wellcome Trust
2012-2024
European Bioinformatics Institute
2014-2024
Pontificia Universidad Católica de Chile
2022-2024
Universidade do Porto
2022
Universidad de Aysén
2022
University of Chile
2022
University of York
2019
Universidad San Francisco de Quito
2019
Universidade de São Paulo
2018
Universidad Michoacana de San Nicolás de Hidalgo
2010-2016
Summary: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of the Linux, Macintosh and Windows operating systems. Availability: The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/ Contact:...
Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs of the software and the complete reimplementation of the software framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional...
To provide the scientific community with a single, centralized, authoritative resource for protein sequences and functional information, the Swiss-Prot, TrEMBL and PIR protein database activities have united to form the Universal Protein Knowledgebase (UniProt) consortium. Our mission is to provide a comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and query interfaces. The central database will have two sections, corresponding to the familiar Swiss-Prot (fully manually curated entries)...
The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from...
The EMBL-EBI provides free access to popular bioinformatics sequence analysis applications as well as to a full-featured text search engine with powerful cross-referencing and data retrieval capabilities. Access to these services is provided via user-friendly web interfaces and via established RESTful and SOAP Web Services APIs (https://www.ebi.ac.uk/seqdb/confluence/display/JDSAT/EMBL-EBI+Web+Services+APIs+-+Data+Retrieval). Both systems have been developed with the same core principles that allow them to integrate an...
InterProScan [E. M. Zdobnov and R. Apweiler (2001) Bioinformatics, 17, 847-848] is a tool that combines different protein signature recognition methods from the InterPro [N. J. Mulder, R. Apweiler, T. K. Attwood, A. Bairoch, A. Bateman, D. Binns, P. Bradley, P. Bork, P. Bucher, L. Cerutti et al. (2005) Nucleic Acids Res., 33, D201-D205] consortium member databases into one resource. At the time of writing there are 10 distinct publicly available databases in the application. Protein as well as DNA sequences can be analysed....
The InterPro database (http://www.ebi.ac.uk/interpro/) integrates together predictive models or 'signatures' representing protein domains, families and functional sites from multiple, diverse source databases: Gene3D, PANTHER, Pfam, PIRSF, PRINTS, ProDom, PROSITE, SMART, SUPERFAMILY and TIGRFAMs. Integration is performed manually and approximately half of the total 58,000 signatures available in the source databases belong to an InterPro entry. Recently, we have started to also display the remaining un-integrated signatures via our web...
The EMBL-EBI search and sequence analysis tools frameworks provide integrated access to EMBL-EBI’s data resources and core bioinformatics analytical tools. EBI Search (https://www.ebi.ac.uk/ebisearch) provides a full-text search engine across nearly 5 billion entries, while the Job Dispatcher tools framework (https://www.ebi.ac.uk/services) enables the scientific community to perform a diverse range of sequence analysis using popular bioinformatics applications. Both allow users to interact through user-friendly web applications, as well as via...
The HMMER webserver [http://www.ebi.ac.uk/Tools/hmmer] is a free-to-use service which provides fast searches against widely used sequence databases and profile hidden Markov model (HMM) libraries using the HMMER software suite (http://hmmer.org). The results of a sequence search may be summarized in a number of ways, allowing users to view and filter the significant hits by domain architecture or taxonomy. For large scale usage, we provide an application programmatic interface (API) which has been expanded in scope, such that all...
The EMBL-EBI provides access to various mainstream sequence analysis applications. These include sequence similarity search services such as BLAST, FASTA and InterProScan, and multiple sequence alignment tools such as ClustalW, T-Coffee and MUSCLE. Through the sequence similarity search services, users can search mainstream sequence databases such as EMBL-Bank and UniProt, and more than 2000 completed genomes and proteomes. We present here a new framework aimed at both novice as well as expert users that exposes novel methods of obtaining annotations and visualizing sequence analysis results through one uniform and consistent...
Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan...
InterPro (http://www.ebi.ac.uk/interpro/) is a freely available database used to classify protein sequences into families and to predict the presence of important domains and sites. InterProScan is the underlying software that allows both protein and nucleic acid sequences to be searched against InterPro's predictive models, which are provided by its member databases. Here, we report recent developments with InterPro and its associated software, including the addition of two new databases (SFLD and CDD), and the functionality to include residue-level annotation...
The InterPro database (http://www.ebi.ac.uk/interpro/) classifies protein sequences into families and predicts the presence of functionally important domains and sites. Here, we report recent developments with InterPro (version 70.0) and its associated software, including an 18% growth in the size of the database in terms of new InterPro entries, updates to content, the inclusion of an additional entry type, refined modelling of discontinuous domains, and the development of a new programmatic interface and website. These developments extend and enrich the information provided by InterPro,...
The InterPro database (http://www.ebi.ac.uk/interpro/) is a freely available resource that can be used to classify sequences into protein families and to predict the presence of important domains and sites. Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and use different methodological approaches to classify protein families and domains. InterPro integrates these signatures, capitalizing on the respective strengths of the individual databases, to produce a powerful protein classification resource. Here, we report on the status of InterPro as it...
InterPro (http://www.ebi.ac.uk/interpro/) is a database that integrates diverse information about protein families, domains and functional sites, and makes it freely available to the public via Web-based interfaces and services. Central to the database are diagnostic models, known as signatures, against which protein sequences can be searched to determine their potential function. InterPro has utility in the large-scale analysis of whole genomes and meta-genomes, as well as in characterizing individual protein sequences. Herein we give an overview of new...
Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or...
It is 14 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated HLA sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Of these, 21 genes encode proteins of the immune system that are highly polymorphic. The naming of these HLA genes and alleles and their quality control is the responsibility of the World Health Organization Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group and in collaboration...
InterPro, an integrated documentation resource of protein families, domains and functional sites, was created to integrate the major protein signature databases. Currently, it includes PROSITE, Pfam, PRINTS, ProDom, SMART, TIGRFAMs, PIRSF and SUPERFAMILY. Signatures are manually integrated into InterPro entries that are curated to provide biological and functional information. Annotation is provided in an abstract, Gene Ontology mapping and links to specialized databases. New features of InterPro include extended protein match views, taxonomic range information and protein 3D structure...
InterPro is an integrated resource for protein families, domains and functional sites, which integrates the following protein signature databases: PROSITE, PRINTS, ProDom, Pfam, SMART, TIGRFAMs, PIRSF, SUPERFAMILY, Gene3D and PANTHER. The latter two new member databases have been integrated since the last publication in this journal. There have been several new developments in InterPro, including an additional reading field, new database links, extensions to the web interface and additional match XML files. InterPro has always provided matches to UniProtKB proteins on...
The Immuno Polymorphism Database (IPD), http://www.ebi.ac.uk/ipd/ is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The IPD project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of killer-cell immunoglobulin-like receptors, IPD-MHC, a database of sequences of the major histocompatibility...
The EMBL-EBI Job Dispatcher sequence analysis tools framework (https://www.ebi.ac.uk/jdispatcher) enables the scientific community to perform a diverse range of sequence analyses using popular bioinformatics applications. Free access to the tools and required sequence datasets is provided through user-friendly web applications, as well as via RESTful and SOAP-based APIs. These are integrated into popular EMBL-EBI resources such as UniProt, InterPro, ENA and Ensembl Genomes. This paper overviews recent improvements to Job Dispatcher, including its...
It is 12 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated HLA sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and are highly polymorphic. The naming of these HLA genes and alleles and their quality control is the responsibility of the WHO Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group and in collaboration with the European Bioinformatics Institute,...