- Biomedical Text Mining and Ontologies
- Genomics and Phylogenetic Studies
- Bioinformatics and Genomic Networks
- Genomics and Rare Diseases
- Cancer-related gene regulation
- Gene expression and cancer classification
- Genetic and phenotypic traits in livestock
- Enzyme function and inhibition
- CRISPR and Genetic Engineering
- Polyamine Metabolism and Applications
- Epigenetics and DNA Methylation
- RNA and protein synthesis mechanisms
- Genomics and Chromatin Dynamics
- Molecular Biology Techniques and Applications
- Coagulation, Bradykinin, Polyphosphates, and Angioedema
- RNA modifications and cancer
- Pluripotent Stem Cells Research
- Genetic diversity and population structure
- Plant Molecular Biology Research
- Ion channel regulation and function
- Nematode management and characterization studies
- Cancer Genomics and Diagnostics
- Genetics, Aging, and Longevity in Model Organisms
- Genetics and Neurodevelopmental Disorders
- Connective Tissue Growth Factor Research
European Bioinformatics Institute
2016-2023
University of Cambridge
2008-2018
University College London
1987-2018
Phoenix Bioinformatics
2018
Queen Mary University of London
2018
Medical College of Wisconsin
2018
University of Oxford
2018
University of Oregon
2018
Jackson Laboratory
2018
California Institute of Technology
2014
FlyBase (http://flybase.org) is a database of Drosophila genetic and genomic information. Gene Ontology (GO) terms are used to describe three attributes of wild-type gene products: their molecular function, the biological processes in which they play a role, and their subcellular location. This article describes recent changes to the GO annotation strategy that are improving the quality of the data. Many of these stem from our participation in the Reference Genome Annotation Project, a multi-database collaboration producing comprehensive...
The Gene Ontology (GO) Consortium (GOC, http://www.geneontology.org) is a community-based bioinformatics resource that classifies gene product function through the use of structured, controlled vocabularies. Over the past year, the GOC has implemented several processes to increase the quantity, quality and specificity of GO annotations. First, the number of manual, literature-based annotations has grown at an increasing rate. Second, as a result of a new 'phylogenetic annotation' process, manually reviewed, homology-based...
The HUGO Gene Nomenclature Committee (HGNC) based at the European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. Currently the HGNC database contains almost 40 000 approved gene symbols, over 19 000 of which represent protein-coding genes. In addition to naming genomic loci, we manually curate genes into family sets based on shared characteristics such as homology, function or phenotype. We have recently updated our resources and introduced new improved visualizations that can be seen...
The HUGO Gene Nomenclature Committee (HGNC) based at EMBL's European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. There are over 40 000 approved gene symbols in our current database, of which 19 000 are for protein-coding genes. The Vertebrate Gene Nomenclature Committee (VGNC) was established in 2016 to assign standardized nomenclature in line with human for vertebrate species that lack their own nomenclature committees. VGNC initially assigned nomenclature to 15000 genes in chimpanzee. We have extended this process to other species, naming 14000 genes in cow and dog and 13...
The HUGO Gene Nomenclature Committee (HGNC) based at EMBL’s European Bioinformatics Institute (EMBL-EBI) assigns unique symbols and names to human genes. There are over 42,000 approved gene symbols in our current database, of which 19 000 are for protein-coding genes. While we still update placeholder and problematic symbols, we are working towards stabilizing symbols where possible; 2000 symbols for disease associated genes are now marked as stable in our symbol reports. All of our data is available at the HGNC website https://www.genenames.org....
The HUGO Gene Nomenclature Committee (HGNC) assigns unique symbols and names to human genes. The HGNC database (www.genenames.org) currently contains over 43 000 approved gene symbols, 19 200 of which are assigned to protein-coding genes, 14 000 to pseudogenes and nearly 9000 to non-coding RNA genes. The public website, www.genenames.org, displays all approved nomenclature within Symbol Reports that contain data curated by HGNC nomenclature advisors and links to related genomic, clinical, and proteomic information. Here, we describe updates to our...
The human genome contains 25 genes coding for selenocysteine-containing proteins (selenoproteins). These proteins are involved in a variety of functions, most notably redox homeostasis. Selenoprotein enzymes with known functions are designated according to these functions: TXNRD1, TXNRD2, and TXNRD3 (thioredoxin reductases), GPX1, GPX2, GPX3, GPX4, and GPX6 (glutathione peroxidases), DIO1, DIO2, and DIO3 (iodothyronine deiodinases), MSRB1 (methionine sulfoxide reductase B1), and SEPHS2 (selenophosphate synthetase...
Patterns of DNA methylation in animal genomes are known to vary from an apparent absence of modified bases, via methylation of a minor fraction of the genome, to genome-wide methylation. Representative genomes from 10 invertebrate phyla comprise predominantly nonmethylated DNA and (usually but not always) a minor fraction of methylated DNA. In contrast, all 27 vertebrate genomes that have been examined display genome-wide methylation. Our studies of chordate genomes suggest that the transition from fractional to global methylation occurred close to the origin of vertebrates, as amphioxus has a typically invertebrate methylation pattern whereas...
The transcription factor Sox1 is the earliest and most specific known marker for mammalian neural progenitors. During fetal development, Sox1 is expressed by proliferating progenitor cells throughout the central nervous system and in no tissue but the lens. We generated a reporter mouse line in which egfp is inserted into the Sox1 locus. GFP animals faithfully recapitulate the expression of the endogenous gene. We have used the reporter to purify neuroepithelial cells by fluorescence-activated cell sorting from embryonic day 10.5 embryos. RNAs prepared from GFP+...
The Gene Ontology (GO) (http://www.geneontology.org) is a community bioinformatics resource that represents gene product function through the use of structured, controlled vocabularies. The number of GO annotations of gene products has increased due to curation efforts among GO Consortium (GOC) groups, including focused literature-based annotation and ortholog-based functional inference. The GO ontologies continue to expand and improve as a result of targeted ontology development, including the introduction of computable logical definitions...
The Gene Ontology (GO) is a collaborative effort that provides structured vocabularies for annotating the molecular function, biological role, and cellular location of gene products in a highly systematic way and in a species-neutral manner, with the aim of unifying the representation of gene function across different organisms. Each contributing member of the GO Consortium independently associates GO terms to gene products from the organism(s) they are annotating. Here we introduce the Reference Genome project, which brings together those independent...
The most widely appreciated role of DNA is to encode protein, yet the exact portion of the human genome that is translated remains to be ascertained. We previously developed PhyloCSF, a widely used tool to identify evolutionary signatures of protein-coding regions using multispecies alignments. Here, we present the first whole-genome PhyloCSF prediction tracks for human, mouse, chicken, fly, worm, and mosquito. We develop a workflow that uses machine learning to predict novel conserved protein-coding regions and efficiently guide their manual curation. We analyze...
FlyTF (http://www.flytf.org) is a database of computationally predicted and/or experimentally verified site-specific transcription factors (TFs) in the fruit fly Drosophila melanogaster. The manual classification of TFs in the initial version, which concentrated primarily on the DNA-binding characteristics of the proteins, has now been extended to a more fine-grained annotation of both DNA binding and regulatory properties in the new release. Furthermore, experimental evidence from the literature was classified into defined...
Gene ontology (GO) annotation is a common task among model organism databases (MODs) for capturing gene function data from journal articles. It is a time-consuming and labor-intensive task, and is thus often considered one of the bottlenecks in literature curation. There is a growing need for semiautomated or fully automated GO curation techniques that will help database curators to rapidly and accurately identify gene function information in full-length articles. Despite multiple attempts in the past, few studies have proven to be useful with...