- Cancer Genomics and Diagnostics
- Machine Learning in Bioinformatics
- Single-cell and spatial transcriptomics
- Bioinformatics and Genomic Networks
- Gene expression and cancer classification
- RNA and protein synthesis mechanisms
- Genomics and Phylogenetic Studies
- Protein Structure and Dynamics
- Acute Myeloid Leukemia Research
- Genetic factors in colorectal cancer
- Hematopoietic Stem Cell Transplantation
- Epigenetics and DNA Methylation
- BRCA gene mutations in cancer
- Biomedical Text Mining and Ontologies
- Neutrophil, Myeloperoxidase and Oxidative Mechanisms
- Ovarian cancer diagnosis and treatment
- PARP inhibition in cancer therapy
- Cancer Cells and Metastasis
- Cardiovascular Health and Disease Prevention
- Hate Speech and Cyberbullying Detection
- Hemoglobinopathies and Related Disorders
- Cell Image Analysis Techniques
- Fractal and DNA sequence analysis
- Cardiovascular Disease and Adiposity
- Cerebrovascular and Carotid Artery Diseases
- Erasmus MC, 2023-2024
- Delft University of Technology, 2018-2023
- Erasmus MC Cancer Institute, 2023
- Cancer Genomics Centre, 2022
- KeyGene (Netherlands), 2018-2021
- Vrije Universiteit Amsterdam, 2019
- Amsterdam Neuroscience, 2019
- Agro Business Park, 2019
- National Technical University of Athens, 2014
Protein function prediction is a difficult bioinformatics problem. Many recent methods use deep neural networks to learn complex sequence representations and predict function from these. Deep supervised models require a lot of labeled training data, which are not available for this task. However, a very large amount of protein sequences without functional labels is available. We applied an existing model that had been pretrained in an unsupervised setting on the task of molecular function prediction. We found feature...
Cell-free DNA (cfDNA) analysis offers a powerful, non-invasive approach to cancer diagnostics and monitoring by revealing tumor-specific genomic and epigenetic alterations. Here, we demonstrate the versatility of MeD-seq, a methylation-dependent sequencing assay, for comprehensive cfDNA analysis, including methylation profiling, chromosomal copy number (CN) alterations, and tumor fraction (TF) estimation. MeD-seq-derived CN profiles and TF estimates from 38 colorectal cancer patients with liver metastases (CRLM) 5...
Abstract. Motivation: Most automatic functional annotation methods assign Gene Ontology (GO) terms to proteins based on annotations of highly similar proteins. We advocate that proteins that are less similar are still informative. Also, despite their simplicity and structure, GO terms seem to be hard for computers to learn, in particular the Biological Process ontology, which has the most terms (>29 000). We propose to use Label-Space Dimensionality Reduction (LSDR) techniques to exploit the redundancy of GO terms and transform them into a more compact latent...
Deep generative models, such as variational autoencoders (VAE), have gained increasing attention in computational biology due to their ability to capture complex data manifolds, which subsequently can be used to achieve better performance in downstream tasks, such as cancer type prediction or subtyping of cancer. However, these models are difficult to train due to the large number of hyperparameters that need to be tuned. To get a better understanding of the importance of the different hyperparameters, we examined six VAE models when trained on TCGA...
Valid characterization of carotid atherosclerosis (CA) is a crucial public health issue, which would limit the major risks held by CA for both patient safety and state economies. This paper investigated the unexplored potential of kinematic features in assisting the diagnostic decision in the framework of a computer-aided diagnosis (CAD) tool. To this end, 15 CAD schemes were designed and fed with a wide variety of kinematic features of the atherosclerotic plaque and the arterial wall adjacent to the plaque for 56 patients from two different hospitals. The schemes were benchmarked in terms...
Background: Patients with advanced-stage epithelial ovarian cancer (EOC) receive treatment with a poly-ADP ribose-polymerase (PARP) inhibitor (PARPi) as maintenance therapy after surgery and chemotherapy. Unfortunately, many patients experience disease progression because of acquired resistance. This study aims to characterize epigenetic and genomic changes in cell-free DNA (cfDNA) associated with PARPi resistance. Materials and Methods: Blood was taken from 31 EOC patients receiving PARPi, before treatment and during/after treatment. Resistance...
Cell-free DNA (cfDNA) molecules are DNA fragments originating from dying cells that are detectable in bodily fluids, such as the plasma. Accelerated cell death, for example caused by disease, induces an elevated concentration of cfDNA. As a result, determining the cell type origins of cfDNA molecules can provide information about an individual's health. In this work, we aim to increase the sensitivity of methylation-based cell type deconvolution by adapting an existing method, CelFiE, which uses the methylation beta values of individual CpG sites...
Co-expression of two genes across different conditions is indicative of their involvement in the same biological process. However, when using RNA-Seq datasets with many experimental conditions from diverse sources, only a subset of the conditions is expected to be relevant for finding genes related to a particular Gene Ontology (GO) term. Therefore, we hypothesize that when the purpose is to find similarly functioning genes, co-expression should not be determined on all samples but only on those informative for the GO term of interest. To address this, we developed Metric...
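The idea of measuring co-expression only on informative samples can be illustrated with a weighted Pearson correlation. This is a minimal sketch, not the method from the abstract above: the function name `weighted_pearson`, the toy expression values, and the per-sample weights are all hypothetical, and how such weights would be learned for a given GO term is not shown.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y where sample i counts with weight w[i]."""
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    total = sum(w)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted means of both expression profiles.
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    # Weighted covariance and variances (normalisation cancels in the ratio).
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / math.sqrt(var_x * var_y)


# Hypothetical expression of two genes across five samples.
gene_a = [1.0, 2.0, 3.0, 9.0, 1.0]
gene_b = [2.0, 4.0, 6.0, 0.0, 7.0]

all_samples = [1.0, 1.0, 1.0, 1.0, 1.0]       # every sample counts equally
informative_only = [1.0, 1.0, 1.0, 0.0, 0.0]  # assumed: only the first three are relevant

print(weighted_pearson(gene_a, gene_b, all_samples))
print(weighted_pearson(gene_a, gene_b, informative_only))
```

In this toy case the two genes look anti-correlated over all samples but perfectly correlated once the two uninformative samples are given zero weight.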
Katja Geertruida Schmahl, Tom Julian Viering, Stavros Makrodimitris, Arman Naseri Jahfari, David Tax, Marco Loog. Proceedings of the Fourth Workshop on Natural Language Processing and Computational Social Science. 2020.
Multi-omic analyses are necessary to understand the complex biological processes taking place at the tissue and cell level, but also to make reliable predictions about, for example, disease outcome. Several linear methods exist that create a joint embedding using paired information per sample, but recently there has been a rise in the popularity of neural architectures that embed paired -omics into the same non-linear manifold. This work describes a head-to-head comparison of these methods using both bulk and single-cell multi-modal datasets. We found...
Abstract. Deep generative models, such as variational autoencoders (VAE), have gained increasing attention in computational biology due to their ability to capture complex data manifolds, which subsequently can be used to achieve better performance in downstream tasks, such as cancer type prediction or subtyping of cancer. However, these models are difficult to train due to the large number of hyperparameters that need to be tuned. To get a better understanding of the importance of the different hyperparameters, we examined six VAE models when trained on...
Abstract. Motivation: Protein function prediction is a difficult bioinformatics problem. Many recent methods use deep neural networks to learn complex sequence representations and predict function from these. Deep supervised models require a lot of labeled training data, which are not available for this task. However, a very large amount of protein sequences without functional labels is available. Results: We applied an existing model that had been pre-trained in an unsupervised setting on the task of function prediction. We found...
Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labeled protein training data. Unsupervised protein embeddings partly circumvent this limitation by learning a universal protein representation from many unlabeled sequences. Such embeddings incorporate contextual information of amino acids, thereby modeling the underlying principles of protein sequences insensitive to the context of species. We used an existing pre-trained protein embedding method and subjected...
Abstract. Multi-omic analyses contribute to understanding complex biological processes, but also to making reliable predictions about, for example, disease outcomes. Several linear joint dimensionality reduction methods exist, but recently neural networks are more commonly used to embed different -omics into the same non-linear manifold. We compared linear to non-linear joint embedding methods using bulk and single-cell data. For modality imputation, neural networks had a clear advantage. Comparisons in downstream supervised tasks lead to the following...
Abstract. Next generation sequencing of cell-free DNA (cfDNA) is a promising method for treatment monitoring and therapy selection in metastatic breast cancer (MBC). However, distinguishing tumor-specific variants from sequencing artefacts and germline variation with a low false discovery rate is challenging when using large targeted sequencing panels covering many tumor suppressor genes. To address this, we built a machine learning model to remove false positive variant calls and augmented it with additional filters to ensure tumor-derived...
Abstract. Motivation: Co-expression of two genes across different conditions is indicative of their involvement in the same biological process. However, using RNA-Seq datasets with many experimental conditions from diverse sources introduces batch effects and other artefacts that might obscure the real co-expression signal. Moreover, only a subset of the experimental conditions is expected to be relevant for finding genes related to a particular Gene Ontology (GO) term. Therefore, we hypothesize that when the purpose is to find similarly functioning genes, the co-expression of genes should not be determined...
Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent these incomplete networks are still useful for genome-wide function prediction. We used network-based classifiers to predict Biological Process Gene Ontology terms from protein interaction data in four species: Saccharomyces...
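As an illustration of what a network-based classifier on a PPI network can look like, the sketch below scores each protein by the fraction of its direct interaction partners that carry a given GO term (often called neighbor voting or guilt by association). It is an assumed, simplified example: the abstract above does not state which classifiers were used, and the function name `neighbor_vote_scores`, the protein identifiers, and the toy network are hypothetical.

```python
def neighbor_vote_scores(edges, annotated):
    """Score each protein by the fraction of its partners annotated with the term.

    edges: iterable of (protein_a, protein_b) undirected interactions.
    annotated: set of proteins known to carry the GO term of interest.
    """
    # Build an undirected adjacency map from the edge list.
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    # Every protein in the map has at least one partner, so no division by zero.
    return {
        protein: len(partners & annotated) / len(partners)
        for protein, partners in neighbors.items()
    }


# Hypothetical toy network with five proteins.
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]
annotated = {"P1", "P2"}  # assumed known annotations for one GO term

scores = neighbor_vote_scores(edges, annotated)
for protein in sorted(scores):
    print(protein, round(scores[protein], 2))
```

Proteins with no recorded interactions never appear in the score map, which is one simple way to see why an incomplete network limits genome-wide prediction.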
Abstract. Physical interaction between two proteins is strong evidence that the proteins are involved in the same biological process, making Protein-Protein Interaction (PPI) networks a valuable data resource for predicting the cellular functions of proteins. However, PPI networks are largely incomplete for non-model species. Here, we tested to what extent these incomplete networks are still useful for genome-wide function prediction. We used network-based classifiers to predict Biological Process Gene Ontology terms from protein interaction data in four species:...
Abstract. The aged hematopoietic system is characterized by decreased immuno-competence and by a reduced number of hematopoietic stem cells (HSCs) that actively generate new blood cells (age-related clonal hematopoiesis, ARCH). While both aspects are commonly associated with an increased risk of aging-related diseases, it is currently unknown to what extent these aspects co-occur during exceptional longevity. Here, we investigated these aspects in an immuno-hematopoietically normal female who reached 111 years. Blood samples were collected...
Abstract. Computationally annotating proteins with a molecular function is a difficult problem that is made even harder due to the limited amount of available labelled protein training data. A recently published supervised molecular function predicting model partly circumvents this limitation by making its predictions based on the universal (i.e. task-agnostic) contextualised protein embeddings from the deep pre-trained unsupervised protein language model SeqVec. SeqVec embeddings incorporate contextual information of amino acids, thereby modelling the underlying...