- Protein Structure and Dynamics
- Machine Learning in Bioinformatics
- Genomics and Phylogenetic Studies
- RNA and protein synthesis mechanisms
- Enzyme Structure and Function
- Bioinformatics and Genomic Networks
- Advanced Proteomics Techniques and Applications
- RNA Research and Splicing
- RNA modifications and cancer
- Mass Spectrometry Techniques and Applications
- Cell Image Analysis Techniques
- Molecular spectroscopy and chirality
- Machine Learning in Materials Science
- Computational Drug Discovery Methods
- Microbial Metabolic Engineering and Bioproduction
- Glycosylation and Glycoproteins Research
- Heat shock proteins research
- Single-cell and spatial transcriptomics
- Galectins and Cancer Biology
- Algorithms and Data Compression
- Genomics and Rare Diseases
- Peptidase Inhibition and Analysis
- Amyotrophic Lateral Sclerosis Research
- Alzheimer's disease research and treatments
- Metabolomics and Mass Spectrometry Studies
VIB-KU Leuven Center for Brain & Disease Research
2020-2025
KU Leuven
2020-2025
Vrije Universiteit Brussel
2014-2022
Université Libre de Bruxelles
2017-2018
VIB-VUB Center for Structural Biology
2015-2017
Intrinsically disordered proteins, defying the traditional protein structure-function paradigm, are a challenge to study experimentally. Because a large part of our knowledge rests on computational predictions, it is crucial that their accuracy is high. The Critical Assessment of protein Intrinsic Disorder prediction (CAID) experiment was established as a community-based blind test to determine the state of the art in prediction of intrinsically disordered regions and the subset of residues involved in binding. A total of 43 methods were evaluated on a dataset of 646...
High-throughput sequencing methods are generating enormous amounts of genomic data, giving unprecedented insights into human genetic variation and its relation to disease. An individual human genome contains millions of Single Nucleotide Variants: to discriminate the deleterious from the benign ones, a variety of methods have been developed that predict whether a protein-coding variant likely affects the carrier individual's health. We present such a method, DEOGEN2, which incorporates heterogeneous information about...
Disulfide bonds are crucial for many structural and functional aspects of proteins. They have a stabilizing role during folding, can regulate enzymatic activity and can trigger allosteric changes in the protein structure. Moreover, knowledge of the topology of the disulfide connectivity can be relevant in genomic annotation tasks and can provide long range constraints for ab-initio protein structure predictors. In this paper we describe PhyloCys, a novel unsupervised predictor of disulfide bond connectivity from known cysteine oxidation states. For each query...
The amyloid conformation can be adopted by a variety of sequences, but the precise boundaries of the amyloid sequence space are still unclear. The currently charted amyloid sequence space is strongly biased towards hydrophobic, beta-sheet prone sequences that form the core of globular proteins and Q/N/Y rich yeast prions. Here, we took advantage of the increasing amount of high-resolution structural information on amyloid cores available in the protein databank to implement a machine learning approach, named Cordax (https://cordax.switchlab.org),...
Developing antibodies is complex and resource-intensive, and methods for designing antibodies targeting specific epitopes are lacking. We introduce a de novo antibody design approach leveraging the empirical force field FoldX to design complementarity determining regions (CDRs). Starting from a scaffold VHH, we tackled three challenges of increasing difficulty: 1) designing CDRs to optimize VHH stability and affinity for its original target; 2) designing CDRs with high affinity for a human ortholog; 3) designing CDRs with low nanomolar affinity for a pre-defined epitope on an unrelated...
Eukaryotic cells contain different membrane-delimited compartments, which are crucial for the biochemical reactions necessary to sustain cell life. Recent studies showed that cells can also trigger the formation of membraneless organelles composed of phase-separated proteins to respond to various stimuli. These condensates provide new ways to control the reactions, and phase-separation proteins (PSPs) are thus revolutionizing how cellular organization is conceived. The small number of experimentally validated proteins, and the difficulty in...
Protein solubility is a key aspect for many biotechnological, biomedical and industrial processes, such as the production of active proteins and antibodies. In addition, understanding the molecular determinants of the solubility of proteins may be crucial to shed light on the molecular mechanisms of diseases caused by aggregation processes such as amyloidosis. Here we present SKADE, a novel Neural Network protein solubility predictor, and we show how it can provide novel insight into the protein solubility mechanisms, thanks to its neural attention architecture. First, we show that SKADE positively compares with...
The FoldX force field was originally validated with a database of 1000 mutants at a time when there were few high-resolution structures. Here we have manually curated 5556 mutants affecting protein stability, resulting in 2484 highly confident mutations denominated FoldX Stability Dataset (FSD), represented in non-redundant X-ray structures with less than 2.5 Å resolution, not involving duplicates, metals or prosthetic groups. Using this database, we created a new version of the FoldX force field by introducing Pi stacking, pH dependency...
Protein folding is a complex process that can lead to disease when it fails. Especially poorly understood are the very early stages of protein folding, which are likely defined by intrinsic local interactions between amino acids close to each other in the protein sequence. We here present EFoldMine, a method that predicts, from the primary amino acid sequence of a protein, which amino acids are likely involved in early folding events. The method is based on early folding data from hydrogen deuterium exchange (HDX) data from NMR pulsed labelling experiments, and uses backbone and sidechain dynamics as well...
Structural bioinformatics suffers from the lack of interfaces connecting biological structures and machine learning methods, making the application of modern neural network architectures impractical. This negatively affects the development of structure-based bioinformatics methods, causing a bottleneck in biological research. Here we present PyUUL (https://pyuul.readthedocs.io/), a library to translate biological structures into 3D tensors, allowing an out-of-the-box application of state-of-the-art deep learning algorithms. The library converts biological macromolecules to data structures typical of computer...
Next Generation Sequencing is dramatically increasing the number of known protein sequences, with related experimentally determined protein structures lagging behind. Structural bioinformatics is attempting to close this gap by developing approaches that predict structure-level characteristics for uncharacterized protein sequences, with most of the developed methods relying heavily on evolutionary information collected from homologous sequences. Here we show that there is a substantial observational selection bias in this approach: the predictions...
Motivation: Protein beta-aggregation is an important but poorly understood phenomenon involved in diseases as well as in beneficial physiological processes. However, while this task has been investigated for over 50 years, very little is known about its mechanisms of action. Moreover, the identification of regions involved in aggregation is still an open problem and the state-of-the-art methods are often inadequate in real case applications. Results: In this article we present AgMata, an unsupervised tool for the identification of such regions from amino acidic...
Machine learning (ML) is ubiquitous in bioinformatics, due to its versatility. One of the most crucial aspects to consider while training an ML model is to carefully select the optimal feature encoding for the problem at hand. Biophysical propensity scales are widely adopted in structural bioinformatics because they describe amino acid properties that are intuitively relevant for many structural and functional aspects of proteins, and are thus commonly used as input features for ML methods. In this paper we reproduce three classical structural bioinformatics prediction...
Motivation: Proteins able to undergo liquid–liquid phase separation (LLPS) in vivo and in vitro are drawing a lot of interest, due to their functional relevance for cell life. Nevertheless, the proteome-scale experimental screening of these proteins seems unfeasible, because besides being expensive and time-consuming, LLPS is heavily influenced by multiple environmental conditions such as concentration, pH and temperature, thus requiring a combinatorial number of experiments for each protein. Results: To...
Deep learning algorithms applied to structural biology often struggle to converge to meaningful solutions when limited data is available, since they are required to learn complex physical rules from examples. State-of-the-art force-fields, however, cannot interface with deep learning algorithms due to their implementation.
Protein dynamics and related conformational changes are essential for their function but difficult to characterise and interpret. Amino acids in a protein behave according to their local energy landscape, which is determined by their local structural context and environmental conditions. The lowest energy state for a given residue can correspond to sharply defined conformations, e.g. in a stable helix, or can cover a wide range of conformations, e.g. in intrinsically disordered regions. A good definition of such low energy states is therefore important to describe the behaviour of a residue and how it...
Motivation: Evolutionary information is crucial for the annotation of proteins in bioinformatics. The amount of retrieved homologs often correlates with the quality of predicted protein annotations related to structure or function. With a growing amount of sequences available, fast and reliable methods for homology detection are essential, as they have a direct impact on predicted protein annotations. Results: We developed a discriminative, alignment-free algorithm with quasi-linear complexity, enabling theoretically much faster...
We provide integrated protein sequence-based predictions via https://bio2byte.be/b2btools/. The aim of our predictions is to identify the biophysical behaviour or features of proteins that are not readily captured by structural biology and/or molecular dynamics approaches. Upload of a FASTA file or text input of a sequence provides integrated predictions from DynaMine backbone and side-chain dynamics, conformational propensities, and derived EFoldMine early folding, DisoMine disorder, and Agmata β-sheet aggregation. These predictions, several...
Methods able to provide reliable protein alignments are crucial for many bioinformatics applications. In recent years many different algorithms have been developed and various kinds of information, from sequence conservation to secondary structure, have been used to improve alignment performance. This is especially relevant for proteins with highly divergent sequences. However, recent works suggest that different features may have different importance in diverse protein classes and it would be an advantage to have more customizable approaches, capable to deal...
Motivation: Cysteine residues have particular structural and functional relevance in proteins because of their ability to form covalent disulfide bonds. Bioinformatics tools that can accurately predict cysteine bonding states are already available, whereas it remains challenging to infer the disulfide connectivity pattern of unknown protein sequences. Improving accuracy in this area is highly relevant for the structural and functional annotation of proteins. Results: We predict the intra-chain disulfide bond connectivity patterns starting from known cysteine bonding states with an...
The role of intrinsically disordered protein regions (IDRs) in cellular processes has become increasingly evident over the last years. These IDRs continue to challenge structural biology experiments because they lack a well-defined conformation, and bioinformatics approaches that accurately delineate disordered protein regions remain essential for their identification and further investigation. Typically, these predictors use only the protein amino acid sequence, without taking into account likely emergent properties that are...
Next generation sequencing technologies are providing increasing amounts of sequencing data, paving the way for improvements in clinical genetics and precision medicine. The interpretation of the observed genomic variants in the light of their phenotypic effects is thus emerging as a crucial task to solve in order to advance our understanding of how exomic variants affect proteins and how the proteins' functional changes affect human health. Since the experimental evaluation of the effects of every observed variant is unfeasible, Bioinformatics methods are being developed to address...
Chemical shifts (CS) are determined from NMR experiments and represent the resonance frequency of the spin of atoms in a magnetic field. They contain a mixture of information, encompassing the in-solution conformations a protein adopts, as well as the movements it performs. Due to their intrinsically multi-faceted nature, CS are difficult to interpret and visualize. Classical approaches for the analysis of CS aim to extract specific protein-related properties, thus discarding a large amount of information that cannot be directly linked...