Burkhard Rost

ORCID: 0000-0003-0179-8424
Research Areas
  • Machine Learning in Bioinformatics
  • Protein Structure and Dynamics
  • Genomics and Phylogenetic Studies
  • RNA and protein synthesis mechanisms
  • Enzyme Structure and Function
  • Genetics, Bioinformatics, and Biomedical Research
  • Bioinformatics and Genomic Networks
  • Microbial Metabolic Engineering and Bioproduction
  • Cancer Research and Treatments
  • Bacterial Genetics and Biotechnology
  • Glycosylation and Glycoproteins Research
  • Computational Drug Discovery Methods
  • Advanced Proteomics Techniques and Applications
  • Microbial Metabolism and Applications
  • Bacteriophages and microbial interactions
  • Genomics and Rare Diseases
  • RNA modifications and cancer
  • Biomedical Text Mining and Ontologies
  • Metabolomics and Mass Spectrometry Studies
  • Molecular Biology Techniques and Applications
  • Enzyme Production and Characterization
  • Microbial Community Ecology and Physiology
  • Microbial Natural Products and Biosynthesis
  • Gene expression and cancer classification
  • Cancer Genomics and Diagnostics

Technical University of Munich
2016-2025

Institute for Advanced Study
2016-2025

Columbia University
2013-2024

Weihenstephan-Triesdorf University of Applied Sciences
2013-2024

New York Structural Biology Center
2010-2021

Institute for Advanced Study
2019

Klinikum rechts der Isar
2013-2018

International Society for Computational Biology
2008-2015

SRI International
2015

Max Planck Institute for Informatics
2007-2015

Sequence alignments unambiguously distinguish between protein pairs of similar and non-similar structure when the pairwise sequence identity is high (>40% for long alignments). The signal gets blurred in the twilight zone of 20–35% sequence identity. Here, more than a million sequence alignments were analysed between protein pairs of known structures to re-define a line distinguishing between true and false positives for low levels of similarity. Four results stood out. (i) The transition from the safe zone of sequence alignment into the twilight zone is described by an explosion of false negatives. More than 95% of all pairs detected had...

10.1093/protein/12.2.85 article EN Protein Engineering Design and Selection 1999-02-01
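
The pairwise sequence identity that the abstract above uses as its threshold can be illustrated with a minimal sketch. It assumes two sequences that are already aligned to equal length, with '-' marking gaps, and it normalises by the number of columns in which both sequences have a residue. The function name percent_identity and this normalisation are illustrative assumptions, not the definition used in the paper.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = 0    # columns with a residue in both sequences
    identical = 0  # aligned columns where the residues match
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue
        aligned += 1
        if x == y:
            identical += 1
    if aligned == 0:
        return 0.0  # no aligned residues at all
    return 100.0 * identical / aligned
```

Other normalisations (for example by the length of the shorter sequence) give different values for the same alignment, so a threshold such as 40% is only meaningful together with the convention used.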

Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement,...

10.1002/prot.340190108 article EN Proteins Structure Function and Bioinformatics 1994-05-01

PredictProtein (http://www.predictprotein.org) is an Internet service for sequence analysis and the prediction of protein structure and function. Users submit protein sequences or alignments; PredictProtein returns multiple sequence alignments, PROSITE sequence motifs, low-complexity regions (SEG), nuclear localization signals, regions lacking regular structure (NORS) and predictions of secondary structure, solvent accessibility, globular regions, transmembrane helices, coiled-coil regions, structural switch regions, disulfide-bonds, sub-cellular localization and functional annotations. Upon...

10.1093/nar/gkh377 article EN Nucleic Acids Research 2004-07-01

10.1016/s0076-6879(96)66033-9 article EN Methods in enzymology on CD-ROM/Methods in enzymology 1996-01-01

Computational biology and bioinformatics provide vast data gold-mines from protein sequences, ideal for Language Models (LMs) taken from Natural Language Processing (NLP). These LMs reach for new prediction frontiers at low inference costs. Here, we trained two auto-regressive models (Transformer-XL, XLNet) and four auto-encoder models (BERT, Albert, Electra, T5) on data from UniRef and BFD containing up to 393 billion amino acids. The protein LMs (pLMs) were trained on the Summit supercomputer using 5616 GPUs and TPU Pod up-to 1024 cores. Dimensionality...

10.1109/tpami.2021.3095381 article EN cc-by IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-07-07
Predrag Radivojac, Wyatt T. Clark, Tal Oron, Alexandra M. Schnoes, Tobias Wittkop, Artem Sokolov, Kiley Graim, Christopher S. Funk, Karin Verspoor, Asa Ben‐Hur, Gaurav Pandey, Jeffrey M. Yunes, Ameet Talwalkar, Susanna Repo, Michael L Souza, Damiano Piovesan, Rita Casadio, Zheng Wang, Jianlin Cheng, Hai Fang, Julian Gough, Patrik Koskinen, Petri Törönen, Jussi Nokso-Koivisto, Liisa Holm, Domenico Cozzetto, Daniel Buchan, Kevin Bryson, David T. Jones, Bhakti Limaye, Harshal Inamdar, Avik Datta, Sunitha K Manjari, Rajendra Joshi, Meghana Chitale, Daisuke Kihara, Andreas Martin Lisewski, Serkan Erdin, Eric Venner, Olivier Lichtarge, Robert Rentzsch, Haixuan Yang, Alfonso E. Romero, Prajwal Bhat, Alberto Paccanaro, Tobias Hamp, Rebecca Kaßner, Stefan Seemayer, Esmeralda Vicedo, Christian Schaefer, Dominik Achten, Florian Auer, Ariane C. Boehm, Tatjana Braun, Maximilian Hecht, B. Mark Heron, Peter Hönigschmid, Thomas A. Hopf, Stefanie Kaufmann, Michael Kiening, Denis Krompaß, Cedric Landerer, Yannick Mahlich, Manfred Roos, Jari Björne, Tapio Salakoski, Andrew Wong, Hagit Shatkay, Fanny Gatzmann, I. Sommer, Mark N. Wass, Michael J.E. Sternberg, Nives Škunca, Fran Supek, Matko Bošnjak, Panče Panov, Sašo Džeroski, Tomislav Šmuc, Yiannis Kourmpetis, Aalt D. J. van Dijk, Cajo J. F. ter Braak, Yuanpeng Zhou, Qingtian Gong, Xinran Dong, Weidong Tian, Marco Falda, Paolo Fontana, Enrico Lavezzo, Barbara Di Camillo, Stefano Toppo, Liang Lan, Nemanja Djuric, Yuhong Guo, Slobodan Vučetić, Amos Bairoch, Michal Linial, Patricia C. Babbitt, Steven E. Brenner, Christine Orengo, Burkhard Rost

Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based critical assessment of protein function annotation (CAFA) experiment. Fifty-four methods representing the state of the art for protein function prediction were evaluated on a target set of 866 proteins from 11 organisms. Two findings stand...

10.1038/nmeth.2340 article EN cc-by-nc-sa Nature Methods 2013-01-27

Many genetic variations are single nucleotide polymorphisms (SNPs). Non-synonymous SNPs are 'neutral' if the resulting point-mutated protein is not functionally discernible from the wild type and 'non-neutral' otherwise. The ability to identify non-neutral substitutions could significantly aid targeting disease causing detrimental mutations, as well as SNPs that increase the fitness of particular phenotypes. Here, we introduced comprehensive data sets to assess the performance of methods that predict SNP effects. Along SNAP...

10.1093/nar/gkm238 article EN cc-by-nc Nucleic Acids Research 2007-05-07

Cell-to-cell communication across multiple cell types and tissues strictly governs proper functioning of metazoans and extensively relies on interactions between secreted ligands and cell-surface receptors. Herein, we present the first large-scale map of cell-to-cell communication between 144 human primary cell types. We reveal that most cells express tens to hundreds of ligands and receptors to create a highly connected signalling network through multiple ligand–receptor paths. We also observe extensive autocrine signalling with approximately two-thirds...

10.1038/ncomms8866 article EN cc-by Nature Communications 2015-07-22

Secondary structure predictions are increasingly becoming the workhorse for several methods aiming at predicting protein structure and function. Here we use ensembles of bidirectional recurrent neural network architectures, PSI‐BLAST‐derived profiles, and a large nonredundant training set to derive two new predictors: (a) the second version of the SSpro program for secondary structure classification into three categories and (b) the first version of the SSpro8 program for secondary structure classification into the eight classes produced by the DSSP program. We describe the results of different test sets on...

10.1002/prot.10082 article EN Proteins Structure Function and Bioinformatics 2002-03-01

10.1093/embo-reports/kvd092 article EN EMBO Reports 2000-11-01

We describe a neural network system that predicts the locations of transmembrane helices in integral membrane proteins. By using evolutionary information as input to the network system, the method significantly improved on a previously published neural network prediction method that had been based on single sequence information. The input data were derived from multiple alignments for each position in a window of 13 adjacent residues: amino acid frequency, conservation weights, number of insertions and deletions, and position of the window with respect to the ends of the protein chain....

10.1002/pro.5560040318 article EN Protein Science 1995-03-01
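
The windowed input described in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the published encoding: it derives only per-column amino acid frequencies from a toy alignment and cuts them into windows of 13 adjacent positions, padding with zero vectors beyond the chain ends. Conservation weights, insertion and deletion counts, and the window position relative to the chain ends are omitted. The names AMINO_ACIDS, column_frequencies and windows are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def column_frequencies(alignment):
    """Per-column amino acid frequencies for equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have equal length")
    profile = []
    for col in range(length):
        counts = [0] * len(AMINO_ACIDS)
        for seq in alignment:
            idx = AMINO_ACIDS.find(seq[col])
            if idx >= 0:  # gaps and non-standard symbols are ignored
                counts[idx] += 1
        total = sum(counts)
        profile.append([c / total if total else 0.0 for c in counts])
    return profile


def windows(profile, width=13):
    """One window of `width` adjacent columns centred on each position.

    Positions beyond either end of the chain are filled with zero vectors
    (an assumption made for this sketch).
    """
    half = width // 2
    pad = [0.0] * len(AMINO_ACIDS)
    padded = [pad] * half + profile + [pad] * half
    return [padded[i:i + width] for i in range(len(profile))]
```

Each window would then be flattened into a single input vector for the network; with 20 frequencies per column and 13 columns that gives 260 values per residue in this reduced form.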

Currently, the prediction of three‐dimensional (3D) protein structure from sequence alone is an exceedingly difficult task. As an intermediate step, a much simpler task has been pursued extensively: predicting 1D strings of secondary structure. Here, we present an analysis of another 1D projection from 3D structure: the relative solvent accessibility of each residue. We show that solvent accessibility is less conserved in 3D homologues than is secondary structure, and hence is predicted less accurately from automatic homology modeling; the correlation coefficient...

10.1002/prot.340200303 article EN Proteins Structure Function and Bioinformatics 1994-11-01

Previously, we introduced a neural network system predicting locations of transmembrane helices (HTMs) based on evolutionary profiles (PHDhtm, Rost B, Casadio R, Fariselli P, Sander C, 1995, Protein Sci 4:521–533). Here, we describe an improvement and an extension of that system. The improvement is achieved by a dynamic programming‐like algorithm that optimizes helices compatible with the neural network output. The extension is the prediction of topology (orientation of first loop region with respect to membrane) by applying to the refined prediction the observation that positively charged...

10.1002/pro.5560050824 article EN Protein Science 1996-08-01

Elucidating the effects of naturally occurring genetic variation is one of the major challenges for personalized health and personalized medicine. Here, we introduce SNAP2, a novel neural network based classifier that improves over the state-of-the-art in distinguishing between effect and neutral variants. Our method's improved performance results from screening many potentially relevant protein features and from refining our development data sets. Cross-validated on >100k experimentally annotated variants, SNAP2 significantly...

10.1186/1471-2164-16-s8-s1 article EN cc-by BMC Genomics 2015-06-18

PredictProtein is a meta-service for sequence analysis that has been predicting structural and functional features of proteins since 1992. Queried with a protein sequence it returns: multiple sequence alignments, predicted aspects of structure (secondary structure, solvent accessibility, transmembrane helices (TMSEG) and strands, coiled-coil regions, disulfide bonds and disordered regions) and function. The service incorporates analysis methods for the identification of functional regions (ConSurf), homology-based inference of Gene Ontology terms...

10.1093/nar/gku366 article EN cc-by Nucleic Acids Research 2014-05-05

By the middle of 1993, >30 000 protein sequences had been listed. For 1000 of these, the three-dimensional (tertiary) structure has been experimentally solved. Another 7000 can be modelled by homology. For the remaining 21 000 sequences, secondary structure prediction provides a rough estimate of structural features. Predictions in three states range between 35% (random) and 88% (homology modelling) overall accuracy. Using information about evolutionary conservation as contained in multiple sequence alignments, the secondary structure of 4700 protein sequences was predicted...

10.1093/bioinformatics/10.1.53 article EN Bioinformatics 1994-01-01

The explosive accumulation of protein sequences in the wake of large-scale sequencing projects is in stark contrast to the much slower experimental determination of protein structures. Improved methods of structure prediction from the gene sequence alone are therefore needed. Here, we report a substantial increase in both the accuracy and quality of secondary-structure predictions, using a neural-network algorithm. The main improvements come from the use of multiple sequence alignments (better overall accuracy), from "balanced training" (better prediction of beta-strands),...

10.1073/pnas.90.16.7558 article EN Proceedings of the National Academy of Sciences 1993-08-15

Predicting protein function and structure from sequence is one important challenge for computational biology. For 26 years, most state-of-the-art approaches combined machine learning and evolutionary information. However, for some applications retrieving related proteins is becoming too time-consuming. Additionally, evolutionary information is less powerful for small families, e.g. for proteins from the Dark Proteome. Both these problems are addressed by the new methodology introduced here. We introduced a novel way to represent protein sequences as continuous...

10.1186/s12859-019-3220-8 article EN cc-by BMC Bioinformatics 2019-12-01

10.1016/s0022-2836(02)01223-8 article EN Journal of Molecular Biology 2002-12-27

10.1016/s0022-2836(02)00016-5 article EN Journal of Molecular Biology 2002-04-01

This paper is an introduction to the supplemental issue of the journal PROTEINS, dedicated to the seventh CASP experiment to assess the state of the art in protein structure prediction. The paper describes the conduct of the experiment, the categories of prediction included, and outlines the evaluation and assessment procedures. Highlights are improvements in model accuracy relative to that obtainable from knowledge of a single best template structure; convergence of the accuracy of models produced by automatic servers toward that produced by human modeling teams; the emergence of methods for...

10.1002/prot.21767 article EN other-oa Proteins Structure Function and Bioinformatics 2007-01-01