Hamish McWilliam

ORCID: 0000-0003-1769-5032
Research Areas
  • Genomics and Phylogenetic Studies
  • Scientific Computing and Data Management
  • RNA and protein synthesis mechanisms
  • T-cell and B-cell Immunology
  • Immune Cell Function and Interaction
  • Microbial Community Ecology and Physiology
  • Microbial Metabolic Engineering and Bioproduction
  • Immunotherapy and Immune Responses
  • Machine Learning in Bioinformatics
  • Advanced Proteomics Techniques and Applications
  • Algorithms and Data Compression
  • Bioinformatics and Genomic Networks
  • Biomedical Text Mining and Ontologies
  • Gene expression and cancer classification
  • Cytomegalovirus and herpesvirus research
  • Research Data Management Practices
  • Innovative Microfluidic and Catalytic Techniques Innovation
  • Environmental DNA in Biodiversity Studies
  • Distributed and Parallel Computing Systems
  • RNA modifications and cancer
  • Bacteriophages and microbial interactions
  • Glycosylation and Glycoproteins Research
  • Genetics, Bioinformatics, and Biomedical Research
  • DNA and Biological Computing
  • Protein Structure and Dynamics

European Bioinformatics Institute
2006-2015

Wellcome Trust
2006-2015

Wellcome Sanger Institute
2007-2014

University of Manchester
2013

University of Bergen
2013

The Royal Free Hospital
2010-2012

Stanford University
2008-2012

University College London
2010-2012

Anthony Nolan
2010-2012

European Molecular Biology Laboratory
2007

Summary: The Clustal W and Clustal X multiple sequence alignment programs have been completely rewritten in C++. This will facilitate the further development of the alignment algorithms in the future and has allowed proper porting of the programs to the latest versions of the Linux, Macintosh and Windows operating systems. Availability: The programs can be run on-line from the EBI web server: http://www.ebi.ac.uk/tools/clustalw2. The source code and executables for Windows, Linux and Macintosh computers are available from the EBI ftp site ftp://ftp.ebi.ac.uk/pub/software/clustalw2/ Contact:...

10.1093/bioinformatics/btm404 article EN Bioinformatics 2007-09-10

Motivation: Robust large-scale sequence analysis is a major challenge in modern genomic science, where biologists are frequently trying to characterize many millions of sequences. Here, we describe a new Java-based architecture for the widely used protein function prediction software package InterProScan. Developments include improvements and additions to the outputs and a complete reimplementation of the framework, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional...

10.1093/bioinformatics/btu031 article EN cc-by Bioinformatics 2014-01-23

The EMBL-EBI provides access to various mainstream sequence analysis applications. These include similarity search services such as BLAST, FASTA and InterProScan, and multiple alignment tools such as ClustalW, T-Coffee and MUSCLE. Through the services, users can search databases such as EMBL-Bank and UniProt, and more than 2000 completed genomes and proteomes. We present here a new framework aimed at both novice as well as expert users that exposes novel methods of obtaining annotations and visualizing results through one uniform and consistent...

10.1093/nar/gkq313 article EN Nucleic Acids Research 2010-05-03

Since 2004 the European Bioinformatics Institute (EMBL-EBI) has provided access to a wide range of databases and analysis tools via Web Services interfaces. This comprises services to search across the databases available from the EMBL-EBI and to explore the network of cross-references present in the data (e.g. EB-eye), services to retrieve entry data in various formats and to access specific fields (e.g. dbfetch), and analysis tool services, for example, sequence similarity search (e.g. FASTA and NCBI BLAST), multiple sequence alignment (e.g. Clustal Omega and MUSCLE), pairwise alignment and protein functional analysis (e.g. InterProScan...

10.1093/nar/gkt376 article EN cc-by Nucleic Acids Research 2013-05-11

Since 2009 the EMBL-EBI Job Dispatcher framework has provided free access to a range of mainstream sequence analysis applications. These include similarity search services (https://www.ebi.ac.uk/Tools/sss/) such as BLAST, FASTA and PSI-Search, multiple alignment tools (https://www.ebi.ac.uk/Tools/msa/) such as Clustal Omega, MAFFT and T-Coffee, and other tools (https://www.ebi.ac.uk/Tools/pfa/) such as InterProScan. Through these services users can search databases such as ENA, UniProt and Ensembl Genomes, utilising a uniform web interface or...

10.1093/nar/gkv279 article EN cc-by-nc Nucleic Acids Research 2015-04-06

It is 14 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Of these, 21 encode proteins of the immune system that are polymorphic. The naming of these alleles and their quality control is the responsibility of the World Health Organization Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group in collaboration...

10.1093/nar/gkn662 article EN cc-by-nc Nucleic Acids Research 2008-10-07

It is 14 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Of these, 21 encode proteins of the immune system that are polymorphic. The naming of these alleles and their quality control is the responsibility of the World Health Organization Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group in collaboration...

10.1093/nar/gks949 article EN cc-by-nc Nucleic Acids Research 2012-10-17

The Immuno Polymorphism Database (IPD), http://www.ebi.ac.uk/ipd/, is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with groups or nomenclature committees who provide and curate individual sections before they are submitted for online publication. The project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of killer-cell immunoglobulin-like receptors; IPD-MHC, a database of the major histocompatibility...

10.1093/nar/gkp879 article EN cc-by-nc Nucleic Acids Research 2009-10-29

It is 12 years since the IMGT/HLA database was first released, providing the HLA community with a searchable repository of highly curated sequences. The HLA complex is located within the 6p21.3 region of human chromosome 6 and contains more than 220 genes of diverse function. Many of these genes encode proteins of the immune system and are polymorphic. The naming of these alleles and their quality control is the responsibility of the WHO Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group in collaboration with the European Bioinformatics Institute,...

10.1093/nar/gkq998 article EN cc-by-nc Nucleic Acids Research 2010-11-11

Motivation: Advancing the search, publication and integration of bioinformatics tools and resources demands consistent machine-understandable descriptions. A comprehensive ontology allowing such descriptions is therefore required. Results: EDAM is an ontology of bioinformatics operations (tool or workflow functions), types of data and identifiers, application domains and data formats. EDAM supports semantic annotation of diverse entities such as Web services, databases, programmatic libraries, standalone tools, interactive applications,...

10.1093/bioinformatics/btt113 article EN cc-by Bioinformatics 2013-03-11

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl), maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK, is a comprehensive collection of nucleotide sequences and annotation from available public sources. The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged daily between the collaborating institutes to achieve swift synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party...

10.1093/nar/gki098 article EN other-oa Nucleic Acids Research 2004-12-17

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl) at the European Bioinformatics Institute, UK, offers a large and freely accessible collection of nucleotide sequences and accompanying annotation. The database is maintained in collaboration with DDBJ and GenBank. Data are exchanged between the collaborating databases on a daily basis to achieve optimal synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including Third Party Annotation, alignments and bulk data. Automated...

10.1093/nar/gkl913 article EN cc-by-nc Nucleic Acids Research 2006-12-06

The Immuno Polymorphism Database (IPD), http://www.ebi.ac.uk/ipd/, is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with groups or nomenclature committees who provide and curate individual sections before they are submitted for online publication. The project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of killer-cell immunoglobulin-like receptors; IPD-MHC, a database of the major histocompatibility...

10.1093/nar/gks1140 article EN cc-by-nc Nucleic Acids Research 2012-11-23

Dramatic increases in the throughput of nucleotide sequencing machines, and the promise of ever greater performance, have thrust bioinformatics into the era of petabyte-scale data sets. Sequence repositories, which provide the feed for these data sets into the worldwide computational infrastructure, are challenged by the impact of these data volumes. The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/embl), comprising the EMBL Nucleotide Sequence Database and the Ensembl Trace Archive, has identified challenges in the storage, movement, analysis, interpretation...

10.1093/nar/gkn765 article EN cc-by-nc Nucleic Acids Research 2008-11-01

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena/) collects, maintains and presents comprehensive nucleic acid sequence and related information as part of the permanent public scientific record. Here, we provide brief updates on ENA content developments and major service enhancements in 2012 and describe in more detail two important areas of development and policy that are driven by ongoing growth in sequencing technologies. First, we describe the data warehouse, a resource for which we provide a programmatic entry point to...

10.1093/nar/gks1175 article EN cc-by-nc Nucleic Acids Research 2012-11-29

The Ensembl Trace Archive (http://trace.ensembl.org/) and the EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), known together as the European Nucleotide Archive, continue to see growth in data volume and diversity. Selected major developments of 2007 are presented briefly, along with data submission and retrieval information. In the face of increasing requirements for nucleotide trace, sequence and annotation data archiving, data capture priority decisions have been taken at the European Nucleotide Archive. Priorities are discussed in terms of how reliably...

10.1093/nar/gkm1018 article EN cc-by-nc Nucleic Acids Research 2007-11-27

The European Bioinformatics Institute (EMBL-EBI) has been providing access to mainstream databases and tools in bioinformatics since 1997. In addition to the traditional web form based interfaces, APIs exist for core data resources such as EMBL-Bank, Ensembl, UniProt, InterPro, PDB and ArrayExpress. These APIs are based on Web Services (SOAP/REST) interfaces that allow users to systematically access databases and analytical tools. From the user's point of view, these Web Services provide the same functionality as the browser-based forms. However, using the APIs frees...

10.1093/nar/gkp302 article EN cc-by-nc Nucleic Acids Research 2009-05-12

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is Europe's primary nucleotide sequence archival resource, safeguarding open data access, engaging in worldwide collaborative data exchange and integrating with the scientific publication process. ENA has made significant contributions to this arena as an active proponent of extending the traditional collaboration to cover capillary and next-generation sequencing information. We have continued to co-develop metadata representation formats with our...

10.1093/nar/gkp998 article EN cc-by-nc Nucleic Acids Research 2009-11-10

The European Nucleotide Archive (ENA; http://www.ebi.ac.uk/ena) is a repository for the world public domain nucleotide sequence data output. ENA content covers a spectrum of data types including raw reads, assembly data and functional annotation. ENA has faced dramatic growth in genome submission rates, data volumes and complexity of datasets. This has prompted a broad reworking of services, for which we now reach the end of a major programme of work, and many enhancements have already been made available over the year to components of the service. In this...

10.1093/nar/gkt1082 article EN cc-by Nucleic Acids Research 2013-11-08

The European Bioinformatics Institute (EMBL-EBI; https://www.ebi.ac.uk) provides free and unrestricted access to data across all major areas of biology and biomedicine. Searching and extracting knowledge across these domains requires a fast and scalable solution that addresses the requirements of domain experts as well as casual users. We present the EBI Search engine, referred to here as 'EBI Search', an easy-to-use text search and indexing system with powerful data navigation and retrieval capabilities. API integration provides access to analytical tools,...

10.1093/nar/gkv316 article EN cc-by-nc Nucleic Acids Research 2015-04-08

The EB-eye is a fast and efficient search engine that provides easy and uniform access to the biological data resources hosted at the EMBL-EBI. Currently, users can access information from more than 62 distinct datasets covering some 400 million entries. The data resources represented in the EB-eye include: nucleotide and protein sequences at both the genomic and proteomic levels, structures ranging from chemicals to macro-molecular complexes, gene-expression experiments, binary level molecular interactions as well as reaction maps and pathway models, functional...

10.1093/bib/bbp065 article EN Briefings in Bioinformatics 2010-02-11

Iterative similarity searches with PSI-BLAST position-specific score matrices (PSSMs) find many more homologs than single searches, but PSSMs can be contaminated when homologous alignments are extended into unrelated protein domains, a problem known as homologous over-extension (HOE). PSI-Search combines an optimal Smith-Waterman local alignment sequence search, using SSEARCH, with the PSI-BLAST profile construction strategy. An optional boundary-masking procedure, which prevents alignments from being extended after they are initially included,...

10.1093/bioinformatics/bts240 article EN cc-by-nc Bioinformatics 2012-04-25

Web Services have gained momentum as a means for packaging existing data and computational resources in a form that is amenable for use and composition by third party applications. The life science community is certainly among the first adopters of Web Services. For example, Taverna (http://www.mygrid.org.uk), a workflow workbench popular within the life science community, provides access to over 3500 thousands web services that can be composed by scientists for constructing and enacting their in silico experiments. However, one of the main issues...

10.1038/npre.2009.3132 preprint EN Nature Precedings 2009-04-22

The European Bioinformatics Institute (EMBL-EBI) provides access to a wide range of databases and analysis tools that are of key importance in bioinformatics. As well as providing Web interfaces to these resources, Web Services are available using SOAP and REST protocols that enable programmatic access to our resources and allow their integration into other applications and analytical workflows. This unit describes the various options available to a typical researcher or bioinformatician who wishes to use our resources via a Web interface or programmatically via a range of programming languages.

10.1002/0471250953.bi0312s48 article EN Current Protocols in Bioinformatics 2014-12-01