Tim Hubbard
- Genomics and Phylogenetic Studies
- RNA and protein synthesis mechanisms
- Protein Structure and Dynamics
- Genomics and Chromatin Dynamics
- Enzyme Structure and Function
- RNA modifications and cancer
- Machine Learning in Bioinformatics
- Cancer Genomics and Diagnostics
- Chromosomal and Genetic Variations
- Genomics and Rare Diseases
- Genomic variations and chromosomal abnormalities
- RNA Research and Splicing
- Genetics, Bioinformatics, and Biomedical Research
- Epigenetics and DNA Methylation
- Bioinformatics and Genomic Networks
- Gene expression and cancer classification
- Biomedical Text Mining and Ontologies
- Genetic factors in colorectal cancer
- CRISPR and Genetic Engineering
- Genetic Neurodegenerative Diseases
- Genetic Mapping and Diversity in Plants and Animals
- Genetics and Neurodevelopmental Disorders
- Microbial Metabolic Engineering and Bioproduction
- Cancer-related molecular mechanisms research
- Molecular Biology Techniques and Applications
Genomics England (2015-2025)
University of Minnesota (2010-2025)
King's College London (2014-2024)
Guy's Hospital (2015-2024)
University of Minnesota System (2024)
Queen Mary University of London (2016-2022)
University of Dundee (2022)
Jackson Laboratory (2021)
Wellcome Sanger Institute (2008-2019)
Croydon University Hospital (2015-2018)
The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for...
The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses...
The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available...
The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual...
The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and the associated requirements from sequence...
Ensembl (http://www.ensembl.org) creates tools and data resources to facilitate genomic analysis in chordate species with an emphasis on human, major vertebrate model organisms and farm animals. Over the past year we have increased the number of species that we support to 77 and expanded our genome browser with a new scrollable overview and improved variation and phenotype views. We also report updates to our core datasets and improvements to our gene homology relationships from the addition of new species. Our REST service has been extended with additional support for...
The contribution of rare and low-frequency variants to human traits is largely unexplored. Here we describe insights from sequencing whole genomes (low read depth, 7×) or exomes (high read depth, 80×) of nearly 10,000 individuals from population-based and disease collections. In extensively phenotyped cohorts we characterize over 24 million novel sequence variants, generate a highly accurate imputation reference panel and identify novel alleles associated with levels of triglycerides (APOB), adiponectin (ADIPOQ) and low-density...
The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in...
The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype...
The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. The SCOP hierarchy comprises the following levels: Species, Protein, Family, Superfamily, Fold and Class. While keeping the original classification scheme intact, we have changed the production of SCOP in order to cope with a rapid growth of new structural data and to facilitate the discovery of new protein relationships. We describe ongoing developments and new features implemented in SCOP. A new update...
The Structural Classification of Proteins (SCOP) database provides a detailed and comprehensive description of the relationships of all known protein structures. The classification is on hierarchical levels: the first two levels, family and superfamily, describe near and far evolutionary relationships; the third, fold, describes geometrical relationships. The distinction between evolutionary relationships and those that arise from the physics and chemistry of proteins is a feature that is unique to this database, so far. SCOP also provides for each structure links to atomic co-ordinates, images...
The Structural Classification of Proteins (SCOP) database is a comprehensive ordering of all proteins of known structure, according to their evolutionary and structural relationships. Protein domains in SCOP are hierarchically classified into families, superfamilies, folds and classes. The continual accumulation of sequence and structural data allows more rigorous analysis and provides important information for understanding the protein world and its evolutionary repertoire. SCOP participates in a project that aims to rationalize and integrate the data on proteins held in several...
The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidenced-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools...
The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org)...
We evaluated 25 protocol variants of 14 independent computational methods for exon identification, transcript reconstruction and expression-level quantification from RNA-seq data. Our results show that most algorithms are able to identify discrete transcript components with high success rates but that assembly of complete isoform structures poses a major challenge even when all constituent elements are identified. Expression-level estimates also varied widely across methods, even when based on similar transcript models. Consequently,...
The Ensembl project (http://www.ensembl.org) is a comprehensive genome information system featuring an integrated set of genome annotation, databases, and other information for chordate, selected model organism and disease vector genomes. As of release 51 (November 2008), Ensembl fully supports 45 species, and three additional species have preliminary support. New species in the past year include orangutan and six additional low coverage mammalian genomes. Major additions and improvements to Ensembl since our previous report include a major redesign of our website; generation of multiple...
The Ensembl (http://www.ensembl.org/) project provides a comprehensive and integrated source of annotation of chordate genome sequences. Over the past year the number of genomes available from Ensembl has increased from 15 to 33, with the addition of sites for the mammalian genomes of elephant, rabbit, armadillo, tenrec, platypus, pig, cat, bush baby, common shrew, microbat and European hedgehog; the fish genomes of stickleback and medaka; and the second example of the genomes of the sea squirt (Ciona savignyi) and the mosquito (Aedes aegypti). Some of the major features added during the year include the first...