- Genomics and Phylogenetic Studies
- Scientific Computing and Data Management
- RNA and protein synthesis mechanisms
- T-cell and B-cell Immunology
- Immune Cell Function and Interaction
- Microbial Community Ecology and Physiology
- Microbial Metabolic Engineering and Bioproduction
- Immunotherapy and Immune Responses
- Machine Learning in Bioinformatics
- Advanced Proteomics Techniques and Applications
- Algorithms and Data Compression
- Bioinformatics and Genomic Networks
- Biomedical Text Mining and Ontologies
- Gene expression and cancer classification
- Cytomegalovirus and herpesvirus research
- Research Data Management Practices
- Innovative Microfluidic and Catalytic Techniques Innovation
- Environmental DNA in Biodiversity Studies
- Distributed and Parallel Computing Systems
- RNA modifications and cancer
- Bacteriophages and microbial interactions
- Glycosylation and Glycoproteins Research
- Genetics, Bioinformatics, and Biomedical Research
- DNA and Biological Computing
- Protein Structure and Dynamics
European Bioinformatics Institute
2006-2015
Wellcome Trust
2006-2015
Wellcome Sanger Institute
2007-2014
University of Manchester
2013
University of Bergen
2013
The Royal Free Hospital
2010-2012
Stanford University
2008-2012
University College London
2010-2012
Anthony Nolan
2010-2012
European Molecular Biology Laboratory
2007
Summary: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of the Linux, Macintosh and Windows operating systems. Availability: The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/ Contact:...
Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs and a complete reimplementation of the framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional...
The EMBL-EBI provides access to various mainstream sequence analysis applications. These include sequence similarity search services such as BLAST, FASTA and InterProScan, and multiple sequence alignment tools such as ClustalW, T-Coffee and MUSCLE. Through the sequence similarity search services, users can search mainstream sequence databases such as EMBL-Bank and UniProt, and more than 2000 completed genomes and proteomes. We present here a new framework, aimed at both novice as well as expert users, that exposes novel methods of obtaining annotations and visualizing sequence analysis results through one uniform and consistent...
Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various data formats and to access the data in specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise sequence alignment and protein functional analysis (e.g. InterProScan...
Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include sequence similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple sequence alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other sequence analysis tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search mainstream sequence databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or...
It is 14 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated HLA sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Of these, 21 genes encode proteins of the immune system that are highly polymorphic. The naming of these HLA genes and alleles and their quality control is the responsibility of the World Health Organization Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group and in collaboration...
The Immuno Polymorphism Database (IPD), http://www.ebi.ac.uk/ipd/, is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The IPD project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of killer-cell immunoglobulin-like receptors; IPD-MHC, a database of sequences of the major histocompatibility...
It is 12 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated HLA sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Many of the genes encode proteins of the immune system and are highly polymorphic. The naming of these HLA genes and alleles and their quality control is the responsibility of the WHO Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group and in collaboration with the European Bioinformatics Institute,...
Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications,...
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl), maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK, is a comprehensive collection of nucleotide sequences and annotation from available public sources. The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged daily between the collaborating institutes to achieve swift synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party...
The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl) at the European Bioinformatics Institute, UK, offers a large and freely accessible collection of nucleotide sequences and accompanying annotation. The database is maintained in collaboration with DDBJ and GenBank. Data are exchanged between the collaborating databases on a daily basis to achieve optimal synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party Annotation, alignments and bulk data. Automated...
Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure, are challenged by the impact of these data volumes. The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/embl), comprising the EMBL Nucleotide Sequence Database and the Ensembl Trace Archive, has identified challenges in the storage, movement, analysis, interpretation...
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) collects, maintains and presents comprehensive nucleic acid sequence and related information as part of the permanent public scientific record. Here, we provide brief updates on ENA content developments and major service enhancements in 2012 and describe in more detail two important areas of development and policy that are driven by ongoing growth in sequencing technologies. First, we describe the ENA data warehouse, a resource for which we provide a programmatic entry point to...
The Ensembl Trace Archive (http://trace.ensembl.org/) and the EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), known together as the European Nucleotide Archive, continue to see growth in data volume and diversity. Selected major developments of 2007 are presented briefly, along with data submission and retrieval information. In the face of increasing requirements for nucleotide trace, sequence and annotation data archiving, data capture priority decisions have been taken at the European Nucleotide Archive. Priorities are discussed in terms of how reliably...
The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, using the APIs frees...
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary nucleotide sequence archival resource, safeguarding open nucleotide data access, engaging in worldwide collaborative data exchange and integrating with the scientific publication process. ENA has made significant contributions to the arena as an active proponent of extending the traditional collaboration to cover capillary and next-generation sequencing information. We have continued to co-develop data and metadata representation formats with our...
The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced a dramatic growth in genome assembly submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of assembly submission services, for which we now reach the end of a major programme of work, and many enhancements have already been made available over the year to components of the submission service. In this...
The European Bioinformatics Institute (EMBL-EBI; https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as 'EBI Search', an easy-to-use text search and indexing system with powerful data navigation and retrieval capabilities. API integration analytical tools,...
The EB-eye is a fast and efficient search engine that provides easy and uniform access to the biological data resources hosted at the EMBL-EBI. Currently, users can access information from more than 62 distinct datasets covering some 400 million entries. The data resources represented in the EB-eye include: nucleotide and protein sequences at both the genomic and proteomic levels, structures ranging from chemicals to macro-molecular complexes, gene-expression experiments, binary level molecular interactions as well as reaction maps and pathway models, functional...
Iterative similarity searches with PSI-BLAST position-specific score matrices (PSSMs) find many more homologs than single searches, but PSSMs can be contaminated when homologous alignments are extended into unrelated protein domains (homologous over-extension, HOE). PSI-Search combines an optimal Smith-Waterman local alignment sequence search, using SSEARCH, with the PSI-BLAST profile construction strategy. An optional sequence boundary-masking procedure, which prevents alignments from being extended after they are initially included,...
Web Services have gained momentum as a means for packaging existing data and computational resources in a form that is amenable for use and composition by third party applications. The life science community is certainly among the first adopters of Web Services. For example, "Taverna":http://www.mygrid.org.uk, a workflow workbench popular within the life science community, provides access to over 3500 thousands web services that can be composed by scientists for constructing and enacting their in silico experiments. However, one of the main issues...
The European Bioinformatics Institute (EMBL-EBI) provides access to a wide range of databases and analysis tools that are of key importance in bioinformatics. As well as providing Web interfaces to these resources, Web Services are available using SOAP and REST protocols that enable programmatic access to our resources and allow their integration into other applications and analytical workflows. This unit describes the various options available to a typical researcher or bioinformatician who wishes to use our resources via a Web interface or programmatically via a range of programming languages.