NFDI4DS | UHH-SEMS - Publication Details

Johannes Söding

ORCID: 0000-0001-9642-8244

About

Contact & Profiles

A5010030898

Research Areas

Genomics and Phylogenetic Studies
RNA and protein synthesis mechanisms
Protein Structure and Dynamics
Machine Learning in Bioinformatics
Genomics and Chromatin Dynamics
RNA Research and Splicing
Bacteriophages and microbial interactions
RNA modifications and cancer
Microbial Community Ecology and Physiology
Enzyme Structure and Function
Cold Atom Physics and Bose-Einstein Condensates
Bioinformatics and Genomic Networks
Advanced Proteomics Techniques and Applications
Glycosylation and Glycoproteins Research
Gene expression and cancer classification
Atomic and Subatomic Physics Research
Bacterial Genetics and Biotechnology
Orbital Angular Momentum in Optics
Mechanical and Optical Resonators
Quantum, superfluid, helium dynamics
CRISPR and Genetic Engineering
Quantum optics and atomic interactions
Genetic Associations and Epidemiology
Protist diversity and phylogeny
Genetics, Bioinformatics, and Biomedical Research

University of Göttingen
2020-2025

Max Planck Institute for Multidisciplinary Sciences
2022-2025

Max Planck Institute for Biophysical Chemistry
2014-2024

Weizmann Institute of Science
2024

Seoul National University
2024

Tissue Dynamics (Israel)
2024

Max Planck Society
2006-2021

Ludwig-Maximilians-Universität München
2008-2016

Center for Integrated Protein Science Munich
2008-2016

Max Planck Institute for Developmental Biology
2004-2015

Fast, scalable generation of high‐quality protein multiple sequence alignments using Clustal Omega

OPENALEX - Publications

Fabian Sievers Andreas Wilm David Dineen Toby J. Gibson Kevin Karplus and 7 more

10.1038/msb.2011.75 article EN Molecular Systems Biology 2011-01-01

The HHpred interactive server for protein homology detection and structure prediction

Johannes Söding A. Biegert Andrei N. Lupas

HHpred is a fast server for remote protein homology detection and structure prediction, and the first to implement pairwise comparison of profile hidden Markov models (HMMs). It allows searching a wide choice of databases, such as PDB, SCOP, Pfam, SMART, COGs and CDD. It accepts a single query sequence or a multiple alignment as input. Within only a few minutes it returns the search results in a user-friendly format similar to that of PSI-BLAST. Search options include local or global alignment and scoring of secondary structure similarity. HHpred can produce query-template...

10.1093/nar/gki408 article EN Nucleic Acids Research 2005-06-26

MMseqs2 enables sensitive protein sequence searching for the analysis of massive data sets

Martin Steinegger Johannes Söding

10.1038/nbt.3988 article EN Nature Biotechnology 2017-10-16

Protein homology detection by HMM–HMM comparison

Johannes Söding

Motivation: Protein homology detection and sequence alignment are at the basis of protein structure prediction, function prediction and evolution. Results: We have generalized the alignment of protein sequences with a profile hidden Markov model (HMM) to the case of pairwise alignment of profile HMMs. We present a method for detecting distant homologous relationships between proteins based on this approach. The method (HHsearch) is benchmarked together with BLAST, PSI-BLAST, HMMER and the profile–profile comparison tools PROF_SIM and COMPASS, in an all-against-all...

10.1093/bioinformatics/bti125 article EN Bioinformatics 2004-11-05

A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core

Lukas Zimmermann Andrew Stephens Seung‐Zin Nam David Rau Jonas M. Kübler and 5 more

10.1016/j.jmb.2017.12.007 article EN Journal of Molecular Biology 2017-12-16

HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment

Michael Remmert A. Biegert Andreas Hauser Johannes Söding

10.1038/nmeth.1818 article EN Nature Methods 2011-12-25

Fast and accurate protein structure search with Foldseek

Michel van Kempen Stephanie Kim Charlotte Tumescheit Milot Mirdita Jeong-Jae Lee and 3 more

As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing tertiary amino acid interactions within proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of Dali, TM-align and CE, respectively.

10.1038/s41587-023-01773-0 article EN cc-by Nature Biotechnology 2023-05-08

HH-suite3 for fast remote homology detection and deep protein annotation

Martin Steinegger Markus Meier Milot Mirdita Harald Vöhringer Stephan J. Haunsberger and 1 more

HH-suite is a widely used open source software suite for sensitive sequence similarity searches and protein fold recognition. It is based on pairwise alignment of profile Hidden Markov models (HMMs), which represent multiple sequence alignments of homologous proteins. We developed a single-instruction multiple-data (SIMD) vectorized implementation of the Viterbi algorithm for profile HMM alignment and introduced various other speed-ups. These accelerated the search methods HHsearch by a factor of 4 and HHblits by a factor of 2 over the previous version 2.0.16. HHblits3...

10.1186/s12859-019-3019-7 article EN cc-by BMC Bioinformatics 2019-09-14

Clustering huge protein sequence sets in linear time

Martin Steinegger Johannes Söding

Metagenomic datasets contain billions of protein sequences that could greatly enhance large-scale functional annotation and structure prediction. Utilizing this enormous resource would require reducing its redundancy by similarity clustering. However, clustering hundreds of millions of sequences is impractical using current algorithms because their runtimes scale as the input set size N times the number of clusters K, which is typically of similar order as N, resulting in runtimes that increase almost quadratically with N. We developed...

10.1038/s41467-018-04964-5 article EN cc-by Nature Communications 2018-06-25

Uniclust databases of clustered and deeply annotated protein sequences and alignments

Milot Mirdita Lars von den Driesch Clovis Galiez María Martin Johannes Söding and 1 more

We present three clustered protein sequence databases, Uniclust90, Uniclust50 and Uniclust30, and three databases of multiple sequence alignments (MSAs), Uniboost10, Uniboost20 and Uniboost30, as a resource for protein sequence analysis, function prediction and sequence searches. The Uniclust databases cluster UniProtKB sequences at the level of 90%, 50% and 30% pairwise sequence identity. Uniclust90 and Uniclust50 clusters showed better consistency of functional annotation than those of UniRef90 and UniRef50, owing to an optimised clustering pipeline that runs with our MMseqs2...

10.1093/nar/gkw1081 article EN cc-by Nucleic Acids Research 2016-11-01

Protein Sequence Analysis Using the MPI Bioinformatics Toolkit

Felix Gabler Seung‐Zin Nam Sebastian Till Milot Mirdita Martin Steinegger and 3 more

The MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) provides interactive access to a wide range of the best-performing bioinformatics tools and databases, including the state-of-the-art protein sequence comparison methods HHblits and HHpred. The Toolkit currently includes 35 external and in-house tools, covering functionalities such as sequence similarity searching, prediction of sequence features, and sequence classification. Due to this breadth of functionality, the tight interconnection of its constituent tools, and its ease of use, the Toolkit has become an important...

10.1002/cpbi.108 article EN cc-by Current Protocols in Bioinformatics 2020-12-01

MMseqs2 desktop and local web server app for fast, interactive sequence searches

Milot Mirdita Martin Steinegger Johannes Söding

The MMseqs2 desktop and web server app facilitates interactive sequence searches through custom protein sequence and profile databases on personal workstations. By eliminating MMseqs2's runtime overhead, we reduced response times to a few seconds at sensitivities close to BLAST. The app is easy to install for non-experts. GPLv3-licensed code, pre-built desktop app packages for Windows, MacOS and Linux, Docker images for the web server application and a demo web server are available at https://search.mmseqs.com. Supplementary data are available at Bioinformatics online.

10.1093/bioinformatics/bty1057 article EN cc-by Bioinformatics 2019-01-04

Fast and accurate automatic structure prediction with HHpred

Andrea Hildebrand Michael Remmert A. Biegert Johannes Söding

Automated protein structure prediction is becoming a mainstream tool for biological research. This has been fueled by steady improvements of publicly available automated servers over the last decade, in particular their ability to build good homology models for an increasing number of targets by reliably detecting and aligning more and more remotely homologous templates. Here, we describe the three fully automated versions of the HHpred server that participated in the community‐wide blind protein structure prediction competition CASP8. What makes HHpred unique...

10.1002/prot.22499 article EN Proteins Structure Function and Bioinformatics 2009-01-01

Uniform transitions of the general RNA polymerase II transcription complex

Andreas Mayer Michael Lidschreiber Matthias Siebert Kristin Leike Johannes Söding and 1 more

10.1038/nsmb.1903 article EN Nature Structural & Molecular Biology 2010-09-05

CCMpred—fast and precise prediction of protein residue–residue contacts from correlated mutations

Stefan Seemayer Markus Gruber Johannes Söding

Motivation: Recent breakthroughs in protein residue–residue contact prediction have made reliable de novo prediction of protein structures possible. The key was to apply statistical methods that can distinguish direct couplings between pairs of columns in a multiple sequence alignment from merely correlated pairs, i.e. to separate direct from indirect effects. Two classes of such methods exist, either relying on regularized inversion of the covariance matrix or on pseudo-likelihood maximization (PLM). Although PLM-based methods offer clearly...

10.1093/bioinformatics/btu500 article EN cc-by Bioinformatics 2014-07-26

Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold

Martin Steinegger Milot Mirdita Johannes Söding

10.1038/s41592-019-0437-4 article EN Nature Methods 2019-06-24

The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis

Vikram Alva Seung‐Zin Nam Johannes Söding Andrei N. Lupas

The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced...

10.1093/nar/gkw348 article EN cc-by-nc Nucleic Acids Research 2016-04-29

Fast and accurate protein structure search with Foldseek

Michel van Kempen Stephanie Kim Charlotte Tumescheit Milot Mirdita Jeong-Jae Lee and 3 more

As structure prediction methods are generating millions of publicly available protein structures, searching these databases is becoming a bottleneck. Foldseek aligns the structure of a query protein against a database by describing the amino acid backbone of proteins as sequences over a structural alphabet. Foldseek decreases computation times by four to five orders of magnitude with 86%, 88% and 133% of the sensitivities of DALI, TM-align and CE, respectively.

10.1101/2022.02.07.479398 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-02-09

WIsH: who is the host? Predicting prokaryotic hosts from metagenomic phage contigs

Clovis Galiez Matthias Siebert François Enault Jonathan Vincent Johannes Söding

WIsH predicts prokaryotic hosts of phages from their genomic sequences. It achieves 63% mean accuracy when predicting the host genus among 20 genera for 3 kbp-long phage contigs. Over the best current tool, WIsH shows much improved accuracy on phage sequences of a few kbp length and runs hundreds of times faster, making it suited for metagenomics studies. OpenMP-parallelized GPL-licensed C++ code is available at https://github.com/soedinglab/wish. Contact: clovis.galiez@mpibpc.mpg.de or soeding@mpibpc.mpg.de. Supplementary data are...

10.1093/bioinformatics/btx383 article EN cc-by Bioinformatics 2017-07-11

A vocabulary of ancient peptides at the origin of folded proteins

Vikram Alva Johannes Söding Andrei N. Lupas

The seemingly limitless diversity of proteins in nature arose from only a few thousand domain prototypes, but the origin of these themselves has remained unclear. We are pursuing the hypothesis that they arose by fusion and accretion from an ancestral set of peptides active as co-factors in RNA-dependent replication and catalysis. Should this be true, contemporary domains may still contain vestiges of such peptides, which could be reconstructed by a comparative approach in the same way in which ancient vocabularies have been reconstructed by the comparative study of modern...

10.7554/elife.09410 article EN cc-by eLife 2015-12-14

Lysine/RNA-interactions drive and regulate biomolecular condensation

Tina Ukmar Saskia Hutten Matthew P. Grieshop Nasrollah Rezaei‐Ghaleh Maria‐Sol Cima‐Omori and 5 more

Cells form and use biomolecular condensates to execute biochemical reactions. The molecular properties of non-membrane-bound condensates are directly connected to the amino acid content of disordered protein regions. Lysine plays an important role in cellular function, but little is known about its role in biomolecular condensation. Here we show that protein disorder is abundant in protein/RNA granules and that lysine is enriched in disordered regions of proteins in P-bodies compared to the entire human proteome. Lysine-rich polypeptides phase separate into...

10.1038/s41467-019-10792-y article EN cc-by Nature Communications 2019-07-02

MMseqs software suite for fast and deep clustering and searching of large protein sequence sets

Maria Hauser Martin Steinegger Johannes Söding

Motivation: Sequence databases are growing fast, challenging existing analysis pipelines. Reducing the redundancy of sequence databases by similarity clustering improves speed and sensitivity of iterative searches. But existing tools cannot efficiently cluster databases of the size of UniProt to 50% maximum pairwise sequence identity or below. Furthermore, in metagenomics experiments typically large fractions of reads cannot be matched to any known sequence anymore because searching with sensitive but relatively slow tools (e.g. BLAST or HMMER3) through...

10.1093/bioinformatics/btw006 article EN Bioinformatics 2016-01-06

MetaEuk—sensitive, high-throughput gene discovery, and annotation for large-scale eukaryotic metagenomics

Eli Levy Karin Milot Mirdita Johannes Söding

Metagenomics is revolutionizing the study of microorganisms and their involvement in biological, biomedical, and geochemical processes, allowing us to investigate by direct sequencing a tremendous diversity of organisms without the need for prior cultivation. Unicellular eukaryotes play essential roles in most microbial communities as chief predators, decomposers, phototrophs, bacterial hosts, symbionts, and parasites to plants and animals. Investigating their roles is therefore of great interest to ecology, biotechnology, human health,...

10.1186/s40168-020-00808-x article EN cc-by Microbiome 2020-04-03

Fast and sensitive taxonomic assignment to metagenomic contigs

Milot Mirdita Martin Steinegger Florian P. Breitwieser Johannes Söding Eli Levy Karin

MMseqs2 taxonomy is a new tool to assign taxonomic labels to metagenomic contigs. It extracts all possible protein fragments from each contig, quickly retains those that can contribute to taxonomic annotation, assigns them with robust labels and determines the contig's taxonomic identity by weighted voting. Its fragment extraction step is suitable for the analysis of all domains of life. MMseqs2 taxonomy is 2-18× faster than state-of-the-art tools and also contains new modules for creating and manipulating taxonomically annotated reference databases as well as reporting and visualizing...

10.1093/bioinformatics/btab184 article EN cc-by Bioinformatics 2021-03-16

RECQL5 Controls Transcript Elongation and Suppresses Genome Instability Associated with Transcription Stress

Marco Saponaro Theodoros Kantidakis Richard Mitter Gavin Kelly B. Mark Heron and 4 more

RECQL5 is the sole member of the RECQ family of helicases associated with RNA polymerase II (RNAPII). We now show that RECQL5 is a general elongation factor that is important for preserving genome stability during transcription. Depletion or overexpression of RECQL5 results in corresponding shifts in the genome-wide RNAPII density profile. Elongation is particularly affected, with RECQL5 depletion causing a striking increase in the average rate, concurrent with increased stalling, pausing, arrest, and/or backtracking (transcription stress). RECQL5 therefore controls...

10.1016/j.cell.2014.03.048 article EN cc-by Cell 2014-05-01
