Jennifer Harrow
- Genomics and Phylogenetic Studies
- RNA and protein synthesis mechanisms
- RNA modifications and cancer
- Genomics and Chromatin Dynamics
- RNA Research and Splicing
- Genetics, Bioinformatics, and Biomedical Research
- Cancer-related molecular mechanisms research
- Chromosomal and Genetic Variations
- Machine Learning in Bioinformatics
- Olfactory and Sensory Function Studies
- Scientific Computing and Data Management
- Genetic Mapping and Diversity in Plants and Animals
- Animal Genetics and Reproduction
- Research Data Management Practices
- Genomics and Rare Diseases
- Molecular Biology Techniques and Applications
- Biochemical Analysis and Sensing Techniques
- Genetic and phenotypic traits in livestock
- Advanced Proteomics Techniques and Applications
- Gene expression and cancer classification
- Immune Cell Function and Interaction
- Epigenetics and DNA Methylation
- T-cell and B-cell Immunology
- Animal Virus Infections Studies
- Genomic variations and chromosomal abnormalities
AstraZeneca (Brazil)
2024
AstraZeneca (United Kingdom)
2023-2024
Genomics (United Kingdom)
2023-2024
Wellcome Sanger Institute
2013-2022
European Bioinformatics Institute
2007-2019
Illumina (United Kingdom)
2017-2019
Max Planck Institute for Developmental Biology
2013
University of California, Santa Cruz
2012-2013
University College London
2013
University of Cambridge
2013
Eukaryotic cells make many types of primary and processed RNAs that are found either in specific subcellular compartments or throughout the cells. A complete catalogue of these RNAs is not yet available, and their characteristic subcellular localizations are also poorly understood. Because RNA represents the direct output of the genetic information encoded by genomes, and a significant proportion of a cell's regulatory capabilities are focused on its synthesis, processing, transport, modification and translation, the generation of such a catalogue is crucial for...
The human genome contains many thousands of long noncoding RNAs (lncRNAs). While several studies have demonstrated compelling biological and disease roles for individual examples, analytical and experimental approaches to investigate these genes have been hampered by the lack of comprehensive lncRNA annotation. Here, we present and analyze the most complete human lncRNA annotation to date, produced by the GENCODE consortium within the framework of the ENCODE project and comprising 9277 manually annotated genes producing 14,880 transcripts. Our analyses...
The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available...
For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ∼1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA...
The Ensembl project (http://www.ensembl.org) is a system for genome annotation, analysis, storage and dissemination designed to facilitate the access of genomic annotation from chordates and key model organisms. It provides access to data from 87 species across our main and early access Pre! websites. This year we introduced three newly annotated species and released numerous updates across our supported species, with a concentration on data for the latest genome assemblies of human, mouse, zebrafish and rat. We also provided two data updates for the previous human assembly, GRCh37, through a dedicated...
Defective Gene Detective: Identifying genes that give rise to diseases is one of the major goals of sequencing human genomes. However, putative loss-of-function genes, which are often some of the first identified targets of genome and exome sequencing, have often turned out to be sequencing errors rather than true genetic variants. In order to identify the true scope of loss-of-function genes within the human genome, MacArthur et al. (p. 823; see the Perspective by Quintana-Murci) extensively validated the genomes from the 1000 Genomes Project, as well as an additional European...
Ensembl (http://www.ensembl.org) creates tools and data resources to facilitate genomic analysis in chordate species with an emphasis on human, major vertebrate model organisms and farm animals. Over the past year we have increased the number of species that we support to 77 and expanded our genome browser with a new scrollable overview and improved variation and phenotype views. We also report updates to our core datasets and improvements to our gene homology relationships from the addition of new species. Our REST service has been extended with additional support for...
Ensembl (http://www.ensembl.org) is a genomic interpretation system providing the most up-to-date annotations, querying tools and access methods for chordates and key model organisms. This year we released updated annotation (gene models, comparative genomics, regulatory regions and variation) on the new human assembly, GRCh38, although we continue to support researchers using the GRCh37.p13 assembly through a dedicated site (http://grch37.ensembl.org). Our Regulatory Build has been revamped to identify regions of interest...
The Ensembl project (http://www.ensembl.org) provides genome information for sequenced chordate genomes with a particular focus on human, mouse, zebrafish and rat. Our resources include evidenced-based gene sets for all supported species; large-scale whole genome multiple species alignments across vertebrates and clade-specific alignments for eutherian mammals, primates, birds and fish; variation data resources for 17 species and regulation annotations based on ENCODE and other data sets. Ensembl data are accessible through the genome browser at http://www.ensembl.org and through other tools...
The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year, including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii), bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org)...
We evaluated 25 protocol variants of 14 independent computational methods for exon identification, transcript reconstruction and expression-level quantification from RNA-seq data. Our results show that most algorithms are able to identify discrete transcript components with high success rates but that assembly of complete isoform structures poses a major challenge even when all constituent elements are identified. Expression-level estimates also varied widely across methods, even when based on similar transcript models. Consequently,...
The GENCODE consortium was formed to identify and map all protein-coding genes within the ENCODE regions. This was achieved by a combination of initial manual annotation by the HAVANA team, experimental validation, and refinement of the annotation based on these results. The gene features are divided into eight different categories, of which only the first two (known and novel coding sequence) are confidently predicted to be protein-coding genes. 5' rapid amplification of cDNA ends (RACE) and RT-PCR were used to experimentally verify the annotation. Of the 420 loci tested, 229...
Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly,...
Determining the full complement of protein-coding genes is a key goal of genome annotation. The most powerful approach for confirming protein-coding potential is the detection of cellular protein expression through peptide mass spectrometry (MS) experiments. Here, we mapped peptides detected in seven large-scale proteomics studies to almost 60% of the protein-coding genes in the GENCODE annotation of the human genome. We found a strong relationship between detection in proteomics experiments and both gene family age and cross-species conservation. Most of the genes for which we detected peptides were highly conserved. >96%...
Authors compare RNA-seq aligners on mouse and human data sets using benchmarks such as alignment yield, splice junction accuracy and suitability for transcript reconstruction. The work highlights the strengths of each program and discusses outstanding needs in RNA-seq analysis. High-throughput RNA sequencing is an increasingly accessible method for studying gene structure and activity on a genome-wide scale. A critical step in RNA-seq data analysis is the alignment of partial transcript reads to a reference genome sequence. To assess the performance of current mapping...
The Vertebrate Genome Annotation (Vega) database (http://vega.sanger.ac.uk) was first made public in 2004 and has been designed to view manual annotation of human, mouse and zebrafish genomic sequences produced at the Wellcome Trust Sanger Institute. Since its initial release, the number of human annotated loci has more than doubled to close to 33 000 and now contains comprehensive annotation on 20 of the 24 human chromosomes, four whole mouse chromosomes and around 40% of the zebrafish Danio rerio genome. In addition, we offer a haplotype regions comparative...
Pseudogenes have long been considered as nonfunctional genomic sequences. However, recent evidence suggests that many of them might have some form of biological activity, and the possibility of functionality has increased interest in their accurate annotation and integration with functional genomics data. As part of the GENCODE annotation of the human genome, we present the first genome-wide pseudogene assignment for protein-coding genes, based on both large-scale manual annotation and in silico pipelines. A key aspect of this coupled approach is that it...
The human major histocompatibility complex (MHC) is contained within about 4 Mb on the short arm of chromosome 6 and is recognised as the most variable region in the human genome. The primary aim of the MHC Haplotype Project was to provide a comprehensively annotated reference sequence of a single, human leukocyte antigen-homozygous MHC haplotype and to use it as a basis against which variations could be assessed from seven other similarly homozygous cell lines, representative of the most common MHC haplotypes in the European population. Comparison of the haplotype sequences,...
Uniform processing and detailed annotation of human, worm and fly RNA-sequencing data reveal ancient, conserved features of the transcriptome, shared co-expression modules (many enriched in developmental genes), matched expression patterns across development and similar extent of non-canonical, non-coding transcription; furthermore, the data are used to create a single, universal model to predict gene-expression levels for all three organisms from chromatin features at the promoter. In this paper the modENCODE consortium reports on...
The FAIR Guiding Principles, published in 2016, aim to improve the findability, accessibility, interoperability and reusability of digital research objects for both humans and machines. Until now the FAIR principles have been mostly applied to research data. The ideas behind these principles are, however, also directly relevant to research software. Hence there is a distinct need to explore how the FAIR principles can be applied to software. In this work, we summarize the current status of the debate around FAIR and software, as a basis for the development of community-agreed principles for FAIR research software in the future. We discuss what...