Felix Teufel

ORCID: 0000-0003-1275-8065
Research Areas
  • Machine Learning in Bioinformatics
  • RNA and Protein Synthesis Mechanisms
  • Genomics and Phylogenetic Studies
  • Antimicrobial Peptides and Activities
  • Chemical Synthesis and Analysis
  • Evolutionary Algorithms and Applications
  • Vaccines and Immunoinformatics Approaches
  • Natural Language Processing Techniques
  • Microbial Inactivation Methods
  • Gaussian Processes and Bayesian Inference
  • Algorithms and Data Compression
  • Advanced Proteomics Techniques and Applications
  • Receptor Mechanisms and Signaling
  • Food Drying and Modeling
  • Protein Structure and Dynamics
  • Magnetic and Electromagnetic Effects

University of Copenhagen
2022-2024

Novo Nordisk (Denmark)
2022-2024

Digital Science (United States)
2024

Center for Systems Biology
2024

Harvard University
2024

Novo Nordisk (United Kingdom)
2024

Technical University of Denmark
2021-2022

ETH Zurich
2021

BOKU University
2018

Signal peptides (SPs) are short amino acid sequences that control protein secretion and translocation in all living organisms. SPs can be predicted from sequence data, but existing algorithms are unable to detect all known types of SPs. We introduce SignalP 6.0, a machine learning model that detects all five SP types and is applicable to metagenomic data.

10.1038/s41587-021-01156-3 article EN cc-by Nature Biotechnology 2022-01-03

DeepLoc 2.0 is a popular web server for the prediction of protein subcellular localization and sorting signals. Here, we introduce DeepLoc 2.1, which additionally classifies input proteins into the membrane types Transmembrane, Peripheral, Lipid-anchored and Soluble. Leveraging pre-trained transformer-based language models, DeepLoc 2.1 utilizes a three-stage architecture for sequence-based, multi-label predictions. Comparative evaluations with other established tools on a test set of 4933 eukaryotic sequences, constructed...

10.1093/nar/gkae237 article EN cc-by Nucleic Acids Research 2024-04-08

10.1007/978-1-0716-4007-4_17 article EN Methods in molecular biology 2024-01-01

Protein subcellular location prediction is a widely explored task in bioinformatics because of its importance in proteomics research. We propose DeepLocPro, an extension to the popular method DeepLoc, tailored specifically to archaeal and bacterial organisms. DeepLocPro is a multiclass subcellular location prediction tool for prokaryotic proteins, trained on experimentally verified data curated from UniProt and PSORTdb. DeepLocPro compares favorably to the PSORTb 3.0 ensemble method, surpassing its performance across multiple metrics in our benchmark...

10.1101/2024.01.04.574157 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2024-01-04

Peptides play important roles in regulating biological processes and form the basis of a multiplicity of therapeutic drugs. To date, only about 300 peptides in human have confirmed bioactivity, although tens of thousands have been reported in the literature. The majority of these are inactive degradation products of endogenous proteins and peptides, presenting a needle-in-a-haystack problem of identifying the most promising candidate peptides from large-scale peptidomics experiments to test for bioactivity. To address this challenge,...

10.1038/s41467-022-34031-z article EN cc-by Nature Communications 2022-10-20

When splitting biological sequence data for the development and testing of predictive models, it is necessary to avoid too-closely related pairs of sequences ending up in different partitions. If this is ignored, the performance of prediction methods will tend to be overestimated. Several algorithms have been proposed for homology reduction, where sequences are removed until no too-closely related pairs remain. We present GraphPart, an algorithm for homology partitioning that divides the data such that closely related sequences always end up in the same partition, while keeping as many sequences as possible...

10.1093/nargab/lqad088 article EN cc-by NAR Genomics and Bioinformatics 2023-10-11

Motivation: Protein subcellular location prediction is a widely explored task in bioinformatics because of its importance in proteomics research. We propose DeepLocPro, an extension to the popular method DeepLoc, tailored specifically to archaeal and bacterial organisms. Results: DeepLocPro is a multiclass subcellular location prediction tool for prokaryotic proteins, trained on experimentally verified data curated from UniProt and PSORTdb. DeepLocPro compares favorably to the PSORTb 3.0 ensemble method, surpassing its performance across multiple...

10.1093/bioinformatics/btae677 article EN cc-by Bioinformatics 2024-11-14

Signal peptides (SPs) are short amino acid sequences that control protein secretion and translocation in all living organisms. As experimental characterization of SPs is costly, prediction algorithms are applied to predict them from sequence data. However, existing methods are unable to detect all known types of SPs. We introduce SignalP 6.0, the first model capable of detecting all five SP types. Additionally, the model accurately identifies the positions of regions within SPs, revealing the defining biochemical properties...

10.1101/2021.06.09.447770 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2021-06-10

Background: AlphaFold’s accuracy, which is often comparable to that of experimentally determined structures, has revolutionized protein structure research. Being a statistical method, AlphaFold implicitly infers the cellular environment, e.g. the cell membrane, from the sequence. Membrane topology prediction methods predict the environment for each residue but not the structure. Current structure and topology prediction tools thus provide complementary information. Results: We introduce the web server MembraneFold. MembraneFold...

10.1101/2022.12.06.518085 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2022-12-08

The genome sequence contains the blueprint for governing cellular processes. While the availability of genomes has vastly increased over the last decades, experimental annotation of the various functional, non-coding and regulatory elements encoded in the DNA remains both expensive and challenging. This has sparked interest in unsupervised language modeling of genomic DNA, a paradigm that has seen great success for protein data. Although models have been proposed, evaluation tasks often differ between individual works, and might not...

10.48550/arxiv.2311.12570 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Motivation: Peptides are ubiquitous throughout life and involved in a wide range of biological processes, ranging from neural signaling in higher organisms to antimicrobial peptides in bacteria. Many peptides are generated post-translationally by cleavage of precursor proteins and can thus not be detected directly from genomics data, as the specificities of the responsible proteases are often not completely understood. Results: We present DeepPeptide, a deep learning model that predicts cleaved peptides directly from the amino acid sequence. DeepPeptide...

10.1093/bioinformatics/btad616 article EN cc-by Bioinformatics 2023-10-01

Many secreted endogenous peptides rely on signalling pathways to exert their function in the body. While peptides can be discovered through high throughput technologies, their cognate receptors typically cannot, hindering the understanding of their mode of action. We investigate the use of AlphaFold-Multimer for identifying the cognate receptors of peptides in human receptor libraries without any prior knowledge about likely candidates. We find that AlphaFold’s predicted confidence metrics have strong performance for prioritizing true peptide-receptor...

10.1101/2022.10.28.514036 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-10-31

When splitting biological sequence data for the development and testing of predictive models, it is necessary to avoid too closely related pairs of sequences ending up in different partitions. If this is ignored, performance estimates of prediction methods will tend to be exaggerated. Several algorithms have been proposed for homology reduction, where sequences are removed until no too closely related pairs remain. We present GraphPart, an algorithm for homology partitioning, where as many sequences as possible are kept in the dataset, but partitions are defined such that closely related sequences always...

10.1101/2023.04.14.536886 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-04-17

Bayesian optimization (BO) is an attractive machine learning framework for performing sample-efficient global optimization of black-box functions. The optimization process is guided by an acquisition function that selects points to acquire in each round of BO. In batched BO, when multiple points are acquired in parallel, commonly used acquisition functions are often high-dimensional and intractable, leading to the use of sampling-based alternatives. We propose a statistical physics inspired acquisition function for BO with Gaussian processes that can natively handle batches. Batched...

10.48550/arxiv.2410.08804 preprint EN arXiv (Cornell University) 2024-10-11

Genetic studies reveal extensive disease-associated variation across the human genome, predominantly in noncoding regions, such as promoters. Quantifying the impact of these variants on disease risk is crucial to our understanding of the underlying mechanisms and to advancing personalized medicine. However, current computational methods struggle to capture variant effects, particularly those of insertions and deletions (indels), which can significantly disrupt gene expression. To address this challenge, we...

10.1101/2024.11.11.623015 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2024-11-12