NFDI4DS | UHH-SEMS - Publication Details

Unsupervised protein embeddings outperform hand-crafted sequence and structure features at predicting molecular function

OPENALEX - Publications

Amelia Villegas-Morcillo, Stavros Makrodimitris, Roeland C. H. J. van Ham, Ángel M. Gómez, Victoria A. Sanchez, and 1 more

Protein function prediction is a difficult bioinformatics problem. Many recent methods use deep neural networks to learn complex sequence representations and predict function from these. Deep supervised models require a lot of labeled training data, which are not available for this task. However, a very large amount of protein sequences without functional labels is available. We applied an existing model that had been pretrained in an unsupervised setting on the task of molecular function prediction. We found feature...

10.1093/bioinformatics/btaa701 article EN cc-by-nc Bioinformatics 2020-08-12

From Background to Signal: Simultaneously derived Copy Numbers from Methylation-Dependent Sequencing cell-free DNA

Daan Hazelaar, Ruben Boers, Joachim Boers, Vanja de Weerd, Maurice Jansen, and 6 more

Cell-free DNA (cfDNA) analysis offers a powerful, non-invasive approach to cancer diagnostics and monitoring by revealing tumor-specific genomic and epigenetic alterations. Here, we demonstrate the versatility of MeD-seq, a methylation-dependent sequencing assay, for comprehensive cfDNA analysis, including methylation profiling, chromosomal copy number (CN) alterations, and tumor fraction (TF) estimation. MeD-seq-derived CN profiles and TF estimates from 38 colorectal cancer patients with liver metastases (CRLM) 5...

10.1101/2025.01.21.633371 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2025-01-24

Circulating Tumour Cells & Circulating Tumour DNA in Patients with Resectable Colorectal Liver Metastases – A Prospective Cohort Study (the MIRACLE)

Lissa Wullaert, Maurice P.H.M. Jansen, Jaco Kraan, Yannick Meyer, Kelly R. Voigt, and 13 more

10.2139/ssrn.5181693 preprint EN 2025-01-01

Improving protein function prediction using protein sequence and GO-term similarities

Stavros Makrodimitris, Roeland C. H. J. van Ham, Marcel J. T. Reinders

Motivation: Most automatic functional annotation methods assign Gene Ontology (GO) terms to proteins based on annotations of highly similar proteins. We advocate that proteins that are less similar are still informative. Also, despite their simplicity and structure, GO terms seem to be hard for computers to learn, in particular the Biological Process ontology, which has the most terms (>29 000). We propose to use Label-Space Dimensionality Reduction (LSDR) techniques to exploit the redundancy of GO terms and transform them into a more compact latent...

10.1093/bioinformatics/bty751 article EN cc-by-nc Bioinformatics 2018-08-29

Benchmarking variational AutoEncoders on cancer transcriptomics data

Mostafa Eltager, Tamim Abdelaal, Mohammed Charrout, Ahmed Mahfouz, Marcel J. T. Reinders, and 1 more

Deep generative models, such as variational autoencoders (VAE), have gained increasing attention in computational biology due to their ability to capture complex data manifolds, which subsequently can be used to achieve better performance in downstream tasks, such as cancer type prediction or subtyping of cancer. However, these models are difficult to train due to the large number of hyperparameters that need to be tuned. To get a better understanding of the importance of the different hyperparameters, we examined six VAE models when trained on TCGA...

10.1371/journal.pone.0292126 article EN cc-by PLoS ONE 2023-10-05

A novel computerized tool to stratify risk in carotid atherosclerosis using kinematic features of the arterial wall

Aimilia Gastounioti, Stavros Makrodimitris, Spyretta Golemati, Nikolaos P.E. Kadoglou, Christos D. Liapis, and 1 more

Valid characterization of carotid atherosclerosis (CA) is a crucial public health issue, which would limit the major risks held by CA for both patient safety and state economies. This paper investigated the unexplored potential of kinematic features in assisting the diagnostic decision for CA in the framework of a computer-aided diagnosis (CAD) tool. To this end, 15 CAD schemes were designed and fed with a wide variety of kinematic features of the atherosclerotic plaque and the arterial wall adjacent to the plaque for 56 patients from two different hospitals. The CAD schemes were benchmarked in terms...

10.1109/jbhi.2014.2329604 article EN IEEE Journal of Biomedical and Health Informatics 2014-01-01

Epigenetic and Genomic Hallmarks of PARP-Inhibitor Resistance in Ovarian Cancer Patients

Tugce Senturk Kirmizitas, C van den Berg, Ruben Boers, Jean C. A. Helmijr, Stavros Makrodimitris, and 15 more

Background: Patients with advanced-stage epithelial ovarian cancer (EOC) receive treatment with a poly-ADP ribose-polymerase (PARP) inhibitor (PARPi) as maintenance therapy after surgery and chemotherapy. Unfortunately, many patients experience disease progression because of acquired resistance. This study aims to characterize epigenetic and genomic changes in cell-free DNA (cfDNA) associated with PARPi resistance. Materials and Methods: Blood was taken from 31 EOC patients receiving PARPi, before and during/after treatment. Resistance...

10.3390/genes15060750 article EN Genes 2024-06-07

Cell type deconvolution of methylated cell-free DNA at the resolution of individual reads

Pia Keukeleire, Stavros Makrodimitris, Marcel J. T. Reinders

Cell-free DNA (cfDNA) are DNA fragments originating from dying cells that are detectable in bodily fluids, such as the plasma. Accelerated cell death, for example caused by disease, induces an elevated concentration of cfDNA. As a result, determining the cell type origins of cfDNA molecules can provide information about an individual's health. In this work, we aim to increase the sensitivity of methylation-based cell type deconvolution by adapting an existing method, CelFiE, which uses the methylation beta values of individual CpG sites...

10.1093/nargab/lqad048 article EN cc-by NAR Genomics and Bioinformatics 2022-06-01

Metric learning on expression data for gene function prediction

Stavros Makrodimitris, Marcel J. T. Reinders, Roeland C. H. J. van Ham

Co-expression of two genes across different conditions is indicative of their involvement in the same biological process. However, when using RNA-Seq datasets with many experimental conditions from diverse sources, only a subset of the conditions is expected to be relevant for finding genes related to a particular Gene Ontology (GO) term. Therefore, we hypothesize that when the purpose is to find similarly functioning genes, co-expression should not be determined on all samples but only on those informative of the GO term of interest. To address this, we developed Metric...

10.1093/bioinformatics/btz731 article EN cc-by Bioinformatics 2019-09-26

Dynamic clonal hematopoiesis and functional T-cell immunity in a supercentenarian

Erik B. van den Akker, Stavros Makrodimitris, Marc Hulsman, Martijn H. Brugman, Tatjana Nikolić, and 9 more

10.1038/s41375-020-01086-0 article EN cc-by Leukemia 2020-11-12

Is Wikipedia succeeding in reducing gender bias? Assessing changes in gender bias in Wikipedia using word embeddings

Katja Geertruida Schmahl, Tom J. Viering, Stavros Makrodimitris, Arman Naseri Jahfari, David M. J. Tax, and 1 more

Katja Geertruida Schmahl, Tom Julian Viering, Stavros Makrodimitris, Arman Naseri Jahfari, David Tax, Marco Loog. Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science. 2020.

10.18653/v1/2020.nlpcss-1.11 article EN cc-by 2020-01-01

An in-depth comparison of linear and non-linear joint embedding methods for bulk and single-cell multi-omics

Stavros Makrodimitris, Bram Pronk, Tamim Abdelaal, Marcel J. T. Reinders

Multi-omic analyses are necessary to understand the complex biological processes taking place at the tissue and cell level, but also to make reliable predictions about, for example, disease outcome. Several linear methods exist that create a joint embedding using paired information per sample, but recently there has been a rise in the popularity of neural architectures that embed paired -omics into the same non-linear manifold. This work describes a head-to-head comparison of linear and non-linear joint embedding methods using both bulk and single-cell multi-modal datasets. We found...

10.1093/bib/bbad416 article EN cc-by Briefings in Bioinformatics 2023-11-22

Benchmarking Variational AutoEncoders on cancer transcriptomics data

Mostafa Eltager, Tamim Abdelaal, Mohammed Charrout, Ahmed Mahfouz, Marcel J. T. Reinders, and 1 more

Deep generative models, such as variational autoencoders (VAE), have gained increasing attention in computational biology due to their ability to capture complex data manifolds, which subsequently can be used to achieve better performance in downstream tasks, such as cancer type prediction or subtyping of cancer. However, these models are difficult to train due to the large number of hyperparameters that need to be tuned. To get a better understanding of the importance of the different hyperparameters, we examined six VAE models when trained on...

10.1101/2023.02.09.527832 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-02-10

Unsupervised protein embeddings outperform hand-crafted sequence and structure features at predicting molecular function

Amelia Villegas-Morcillo, Stavros Makrodimitris, Roeland C. H. J. van Ham, Ángel M. Gómez, Victoria Sánchez, and 1 more

Motivation: Protein function prediction is a difficult bioinformatics problem. Many recent methods use deep neural networks to learn complex sequence representations and predict function from these. Deep supervised models require a lot of labeled training data, which are not available for this task. However, a very large amount of protein sequences without functional labels is available. Results: We applied an existing model that had been pre-trained in an unsupervised setting on the task of molecular function prediction. We found...

10.1101/2020.04.07.028373 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-04-08

The Power of Universal Contextualized Protein Embeddings in Cross-species Protein Function Prediction

Irene van den Bent, Stavros Makrodimitris, Marcel J. T. Reinders

Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labeled protein training data. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporate contextual information of amino acids, thereby modeling the underlying principles of protein sequences insensitive to the context of species. We used an existing pre-trained protein embedding method and subjected...

10.1177/11769343211062608 article EN cc-by-nc Evolutionary Bioinformatics 2021-01-01

An in-depth comparison of linear and non-linear joint embedding methods for bulk and single-cell multi-omics

Stavros Makrodimitris, Bram Pronk, Tamim Abdelaal, Marcel J. T. Reinders

Multi-omic analyses contribute to understanding complex biological processes, but also to making reliable predictions about, for example, disease outcomes. Several linear joint dimensionality reduction methods exist, but recently neural networks are more commonly used to embed different -omics into the same non-linear manifold. We compared linear and non-linear joint embedding methods using bulk and single-cell data. For modality imputation, non-linear methods had a clear advantage. Comparisons in downstream supervised tasks lead to the following...

10.1101/2023.04.10.535672 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-04-11

Machine learning-based somatic variant calling in cell-free DNA of metastatic breast cancer patients using large NGS panels

Elisabeth M. Jongbloed, Maurice P.H.M. Jansen, Vanja de Weerd, Jean A. Helmijr, Corine M. Beaufort, and 9 more

Next generation sequencing of cell-free DNA (cfDNA) is a promising method for treatment monitoring and therapy selection in metastatic breast cancer (MBC). However, distinguishing tumor-specific variants from sequencing artefacts and germline variation with a low false discovery rate is challenging when using large targeted sequencing panels covering many tumor suppressor genes. To address this, we built a machine learning model to remove false positive variant calls and augmented it with additional filters to ensure tumor-derived...

10.1038/s41598-023-37409-1 article EN cc-by Scientific Reports 2023-06-27

Metric Learning on Expression Data for Gene Function Prediction

Stavros Makrodimitris, Marcel J. T. Reinders, Roeland C. H. J. van Ham

Motivation: Co-expression of two genes across different conditions is indicative of their involvement in the same biological process. However, using RNA-Seq datasets with many experimental conditions from diverse sources introduces batch effects and other artefacts that might obscure the real co-expression signal. Moreover, only a subset of the experimental conditions is expected to be relevant for finding genes related to a particular Gene Ontology (GO) term. Therefore, we hypothesize that when the purpose is to find similarly functioning genes, co-expression should not be determined...

10.1101/651042 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2019-05-27

A thorough analysis of the contribution of experimental, derived and sequence-based predicted protein-protein interactions for functional annotation of proteins

Stavros Makrodimitris, Marcel J. T. Reinders, Roeland C. H. J. van Ham

Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent these networks are still useful for genome-wide function prediction. We used network-based classifiers to predict Biological Process Gene Ontology terms from protein interaction data in four species: Saccharomyces...

10.1371/journal.pone.0242723 article EN cc-by PLoS ONE 2020-11-25

A thorough analysis of the contribution of experimental, derived and sequence-based predicted protein-protein interactions for functional annotation of proteins

Stavros Makrodimitris, Marcel J. T. Reinders, Roeland C. H. J. van Ham

Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent these networks are still useful for genome-wide function prediction. We used network-based classifiers to predict Biological Process Gene Ontology terms from protein interaction data in four species:...

10.1101/832253 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-11-07

Dynamic Clonal Hematopoiesis and Functional T-cell Immunity in a Super-centenarian

Erik B. van den Akker, Stavros Makrodimitris, Marc Hulsman, Martijn H. Brugman, Tatjana Nikolić, and 9 more

The aged hematopoietic system is characterized by decreased immuno-competence and a reduced number of hematopoietic stem cells (HSCs) that actively generate new blood cells (age-related clonal hematopoiesis, ARCH). While both aspects are commonly associated with an increased risk of aging-related diseases, it is currently unknown to what extent these aspects co-occur during exceptional longevity. Here, we investigated these aspects in an immuno-hematopoietically normal female who reached 111 years. Blood samples were collected...

10.1101/788752 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-10-01

The power of universal contextualised protein embeddings in cross-species protein function prediction

Irene van den Bent, Stavros Makrodimitris, Marcel J. T. Reinders

Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labelled protein training data. A recently published supervised molecular function predicting model partly circumvents this limitation by making its predictions based on universal (i.e. task-agnostic) contextualised protein embeddings from the deep pre-trained unsupervised protein language model SeqVec. SeqVec embeddings incorporate contextual information of amino acids, thereby modelling the underlying...

10.1101/2021.04.19.440461 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2021-04-19

Machine learning-based somatic variant calling in cell-free DNA of metastatic breast cancer patients using large NGS panels

Elisabeth M. Jongbloed, Maurice P.H.M. Jansen, Vanja de Weerd, Jean A. Helmijr, Corine M. Beaufort, and 9 more

Next generation sequencing of cell-free DNA (cfDNA) is a promising method for treatment monitoring and therapy selection in metastatic breast cancer (MBC). However, distinguishing tumor-specific variants from sequencing artefacts and germline variation with a low false discovery rate is challenging when using large targeted sequencing panels covering many tumor suppressor genes. To address this, we built a machine learning model to remove false positive variant calls and augmented it with additional filters to ensure tumor-derived...

10.21203/rs.3.rs-2742846/v1 preprint EN cc-by Research Square (Research Square) 2023-04-06