NFDI4DS | UHH-SEMS - Publication Details

James Cuff

ORCID: 0000-0003-0570-4106

About

Contact & Profiles

OpenAlex author ID: A5013280530

Research Areas

Genomics and Phylogenetic Studies
RNA and protein synthesis mechanisms
Chromosomal and Genetic Variations
Scientific Computing and Data Management
Genomics and Chromatin Dynamics
Enzyme Structure and Function
Protein Structure and Dynamics
Genetic Mapping and Diversity in Plants and Animals
Genetics, Bioinformatics, and Biomedical Research
Glycosylation and Glycoproteins Research
Software-Defined Networks and 5G
Network Security and Intrusion Detection
Machine Learning in Bioinformatics
IoT and Edge/Fog Computing
Cloud Computing and Resource Management
Smart Grid Security and Resilience
RNA Research and Splicing
Model-Driven Software Engineering Techniques
Research Data Management Practices
Animal Genetics and Reproduction
Distributed and Parallel Computing Systems
Algorithms and Data Compression
Bioinformatics and Genomic Networks
Cloud Data Security Solutions
Epigenetics and DNA Methylation

University of Liverpool
2020-2023

Carnegie Mellon University
2020-2023

IBM (United Kingdom)
2020-2023

Qinetiq (United Kingdom)
2020-2023

University of Toronto
2020-2023

IBM Research - Thomas J. Watson Research Center
2020-2023

Boston University
2020

Harvard University
2011-2018

Harvard University Press
2007-2017

Baylor Genetics
2007

Initial sequencing and comparative analysis of the mouse genome

OPENALEX - Publications

R Waterston Kerstin Lindblad‐Toh Ewan Birney Jane Rogers Josep F. Abril and 95 more

10.1038/nature01262 article EN Nature 2002-12-01

A Bivalent Chromatin Structure Marks Key Developmental Genes in Embryonic Stem Cells

B Bernstein Tarjei S. Mikkelsen Xiaohui Xie Michael Kamal Dana J. Huebert and 10 more

10.1016/j.cell.2006.02.041 article EN publisher-specific-oa Cell 2006-04-01

Genome sequence, comparative analysis and haplotype structure of the domestic dog

Kerstin Lindblad‐Toh Claire M. Wade Tarjei S. Mikkelsen Elinor K. Karlsson David B. Jaffe and 41 more

10.1038/nature04338 article EN Nature 2005-12-01

The Ensembl genome database project

Tim Hubbard Darren Barker Ewan Birney Graham Cameron Y. Chen and 30 more

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and the associated requirements from sequence...

10.1093/nar/30.1.38 article EN Nucleic Acids Research 2002-01-01

The Jalview Java alignment editor

Michèle Clamp James Cuff Stephen M. J. Searle Geoffrey J. Barton

Multiple sequence alignment remains a crucial method for understanding the function of groups of related nucleic acid and protein sequences. However, it is known that automatic multiple sequence alignments can often be improved by manual editing. Therefore, tools are needed to view and edit multiple sequence alignments. Due to growth in the sequence databases, multiple sequence alignments can often be large and difficult to view efficiently. The Jalview Java alignment editor is presented here, which enables fast viewing and editing of large multiple sequence alignments.

10.1093/bioinformatics/btg430 article EN Bioinformatics 2004-01-22

A high-resolution map of human evolutionary constraint using 29 mammals

Kerstin Lindblad‐Toh Manuel Garber Or Zuk Michael F. Lin Brian J. Parker and 55 more

The comparison of related genomes has emerged as a powerful lens for genome interpretation. Here we report the sequencing and comparative analysis of 29 eutherian genomes. We confirm that at least 5.5% of the human genome has undergone purifying selection, and locate constrained elements covering ∼4.2% of the genome. We use evolutionary signatures and comparisons with experimental data sets to suggest candidate functions for ∼60% of constrained bases. These elements reveal a small number of new coding exons, candidate stop codon readthrough events and over 10,000 regions...

10.1038/nature10530 article EN cc-by-nc-sa Nature 2011-10-01

JPred: a consensus secondary structure prediction server.

James Cuff M. Clamp A.S. Siddiqui M. Raymond V. Finlay Geoffrey J. Barton

An interactive protein secondary structure prediction Internet server is presented. The server allows a single sequence or multiple alignment to be submitted, and returns predictions from six secondary structure prediction algorithms that exploit evolutionary information from multiple sequences. A consensus prediction is also returned which improves the average Q3 accuracy of prediction by 1% to 72.9%. The server simplifies the use of current prediction algorithms and allows conservation patterns important to structure and function to be identified. Availability: http://barton.ebi.ac.uk/servers/jpred.html Contact: geoff@ebi.ac.uk

10.1093/bioinformatics/14.10.892 article EN Bioinformatics 1998-01-01

Application of multiple sequence alignment profiles to improve protein secondary structure prediction

James Cuff Geoffrey J. Barton

The effect of training a neural network secondary structure prediction algorithm with different types of multiple sequence alignment profiles derived from the same sequences is shown to provide a range of accuracy from 70.5% to 76.4%. The best accuracy of 76.4% (standard deviation 8.4%) is 3.1% (Q(3)) and 4.4% (SOV2) better than the PHD algorithm run on the same set of 406 sequence non-redundant proteins that were not used to train either method. Residues predicted by the new method with a confidence value of 5 or greater have an average Q(3) accuracy of 84%, and cover 68% of the residues....

10.1002/1097-0134(20000815)40:3<502::aid-prot170>3.0.co;2-q article EN Proteins Structure Function and Bioinformatics 2000-01-01

Genome of the marsupial Monodelphis domestica reveals innovation in non-coding sequences

Tarjei S. Mikkelsen Matthew J. Wakefield Bronwen Aken Chris T. Amemiya Jean L. Chang and 57 more

10.1038/nature05805 article EN Nature 2007-05-01

Evaluation and improvement of multiple sequence methods for protein secondary structure prediction

James Cuff Geoffrey J. Barton

A new dataset of 396 protein domains is developed and used to evaluate the performance of the protein secondary structure prediction algorithms DSC, PHD, NNSSP, and PREDATOR. The maximum theoretical Q3 accuracy for combination of these methods is shown to be 78%. A simple consensus prediction on the 396 domains, with automatically generated multiple sequence alignments, gives an average Q3 prediction accuracy of 72.9%. This is a 1% improvement over PHD, which was the best single method evaluated. Segment Overlap Accuracy (SOV) is 75.4% for the consensus method on the 396-protein set. The secondary structure definition method DSSP defines 8...

10.1002/(sici)1097-0134(19990301)34:4<508::aid-prot10>3.0.co;2-4 article EN Proteins Structure Function and Bioinformatics 1999-03-01
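
The abstract above reports Q3 accuracy for a consensus of several predictors. As a rough illustration only, the sketch below computes Q3 as the percentage of residues whose three-state label (H, E or C) matches the reference, and builds a consensus by per-residue majority vote. The voting rule, the tie-break to coil, the function names and the toy sequences are all assumptions made for illustration; the excerpt does not describe the paper's actual consensus scheme.

```python
from collections import Counter


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def consensus(predictions, tie_break="C"):
    """Per-residue majority vote over several predictions.

    Ties fall back to `tie_break` (coil); this rule is an assumption,
    not taken from the source.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have equal length")
    result = []
    for column in zip(*predictions):
        counts = Counter(column).most_common()
        top_label, top_count = counts[0]
        tied = len(counts) > 1 and counts[1][1] == top_count
        result.append(tie_break if tied else top_label)
    return "".join(result)


# Toy data (hypothetical): a reference assignment and three predictions.
observed = "HHHHCCEEEE"
methods = ["HHHCCCEEEE", "HHHHCCCEEE", "CHHHCCEEEC"]

for m in methods:
    print(m, q3_accuracy(m, observed))
combined = consensus(methods)
print(combined, q3_accuracy(combined, observed))
```

In this toy case each method errs at a different residue, so the vote recovers the reference exactly; real predictors share errors, which is consistent with the much smaller 1% gain the abstract reports.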

Distinguishing protein-coding and noncoding genes in the human genome

Michèle Clamp Ben Fry Mike Kamal Xiaohui Xie James Cuff and 4 more

Although the Human Genome Project was completed 4 years ago, the catalog of human protein-coding genes remains a matter of controversy. Current catalogs list a total of approximately 24,500 putative protein-coding genes. It is broadly suspected that a large fraction of these entries are functionally meaningless ORFs present by chance in RNA transcripts, because they show no evidence of evolutionary conservation with mouse or dog. However, there is currently no scientific justification for excluding ORFs simply because they fail to show evolutionary conservation:...

10.1073/pnas.0709013104 article EN Proceedings of the National Academy of Sciences 2007-11-27

Analyses of deep mammalian sequence alignments and constraint predictions for 1% of the human genome

Elliott H. Margulies Gregory M. Cooper George Asimenos Daryl J. Thomas Colin N. Dewey and 72 more

A key component of the ongoing ENCODE project involves rigorous comparative sequence analyses for the initially targeted 1% of the human genome. Here, we present orthologous sequence generation, alignment, and evolutionary constraint analyses of 23 mammalian species for all ENCODE targets. Alignments were generated using four different methods; comparisons of these methods reveal large-scale consistency but substantial differences in terms of small genomic rearrangements, sensitivity (sequence coverage), and specificity (alignment accuracy)....

10.1101/gr.6034307 article EN cc-by-nc Genome Research 2007-06-01

Ensembl 2004

Ewan Birney Dan Andrews A. Paul Bevan Mario Cáccamo Graham Cameron and 43 more

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organize biology around the sequences of large genomes. It is a comprehensive and integrated source of annotation of large genome sequences, available via interactive website, web services or flat files. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. The facilities of the system range from sequence analysis to data storage...

10.1093/nar/gkh038 article EN Nucleic Acids Research 2003-12-17

The Ensembl Computing Architecture

James Cuff Guy Coates Tim Cutts Mark Rae

Ensembl is a software project to automatically annotate large eukaryotic genomes and release them freely into the public domain. The project currently annotates 10 complete genomes. This makes very large demands on compute resources, due to the vast number of sequence comparisons that need to be executed. To circumvent the financial outlay often associated with classical supercomputing environments, farms of multiple, lower-cost machines have now become the norm and have been deployed successfully with this project. The architecture and design...

10.1101/gr.1866304 article EN cc-by-nc Genome Research 2004-05-01

The medical science DMZ: a network design pattern for data-intensive medical science

Sean Peisert Eli Dart William K. Barnett Edward Balas James Cuff and 4 more

We describe a detailed solution for maintaining high-capacity, data-intensive network flows (e.g., 10, 40, 100 Gbps+) in a scientific, medical context while still adhering to security and privacy laws and regulations.

10.1093/jamia/ocx104 article EN cc-by-nc Journal of the American Medical Informatics Association 2017-08-31

The Medical Science DMZ

Sean Peisert William K. Barnett Eli Dart James Cuff Robert L. Grossman and 4 more

Objective: We describe use cases and an institutional reference architecture for maintaining high-capacity, data-intensive network flows (e.g., 10, 40, 100 Gbps+) in a scientific, medical context while still adhering to security and privacy laws and regulations. Materials and Methods: High-end networking, packet filter firewalls, network intrusion detection systems. Results: We describe a “Medical Science DMZ” concept as an option for secure, high-volume transport of large, sensitive data sets between research institutions...

10.1093/jamia/ocw032 article EN cc-by-nc Journal of the American Medical Informatics Association 2016-05-02

A High-Availability Cloud for Research Computing

Justin Riley John Noss Wes Dillingham James Cuff Ignacio M. Llórente

This article describes the lessons learned, challenges faced, and innovations made in designing and implementing a high-availability private cloud for research computing.

10.1109/mc.2017.182 article EN Computer 2017-01-01

ProtEST: protein multiple sequence alignments from expressed sequence tags

James Cuff Ewan Birney Michèle Clamp Geoffrey J. Barton

An automatic sequence searching method (ProtEST) is described which constructs multiple protein sequence alignments from protein sequences and translated expressed sequence tags (ESTs). ProtEST is more effective than a simple TBLASTN search of the query against the EST database, as the sequences are automatically clustered, assembled, made non-redundant, checked for sequence errors, translated into protein and then aligned and displayed. A ProtEST search found a non-redundant, translated, error- and length-corrected EST sequence for > 58% of sequences when single sequences from 1407 Pfam-A seed alignments were used as the probe. The average family size of the resulting...

10.1093/bioinformatics/16.2.111 article EN Bioinformatics 2000-02-01

Welcome to the first issue of Applied AI Letters

Edward O. Pyzer‐Knapp James Cuff Jack Patterson Olexandr Isayev Simon Maskell

It is a pleasure and an honour to welcome you to the first edition of Applied AI Letters. Getting to this point has been a combination of many people's hard work and we are very excited to move into the next stage, sharing our vision for Applied AI Letters with you. When we consider the lifecycle of a successful idea, we can identify some unique stages. We have put these together in Figure 1. Initially, a challenge should be identified, and it is often (although not always) the case that there is an idea of the impact that solving it can have. If this challenge ignites the scientific spirit,...

10.1002/ail2.8 article EN cc-by Applied AI Letters 2020-08-04

The 500 builds of 300 applications in the HeLmod repository will at least get you started on a full suite of scientific applications

Aaron Kitzmiller John Brunelle Michèle Clamp James Cuff

10.7490/f1000research.1110117.1 article EN F1000Research 2015-07-24
