- Scientific Computing and Data Management
- Research Data Management Practices
- Data Quality and Management
- Semantic Web and Ontologies
- Biomedical Text Mining and Ontologies
- Genetics, Bioinformatics, and Biomedical Research
- Horticultural and Viticultural Research
- Health disparities and outcomes
- Bioinformatics and Genomic Networks
- Microbial Community Ecology and Physiology
- Mathematical and Theoretical Epidemiology and Ecology Models
- Birth, Development, and Health
- Evolutionary Game Theory and Cooperation
- Natural Language Processing Techniques
- Ecology and Vegetation Dynamics Studies
- Data Mining Algorithms and Applications
- Genetics, Aging, and Longevity in Model Organisms
- Animal Ecology and Behavior Studies
- Gene expression and cancer classification
- Management, Economics, and Public Policy
- Adipose Tissue and Metabolism
- Topic Modeling
- Evolution and Genetic Dynamics
- Health, Environment, Cognitive Aging
- Actinomycetales infections and treatment
University Medical Center Groningen
2015-2024
University of Groningen
2015-2024
Vrije Universiteit Amsterdam
2001-2004
The volume and complexity of biological data are increasing rapidly. Many clinical professionals and biomedical researchers without a bioinformatics background are generating big '-omics' data, but do not always have the tools to manage, process or publicly share these data. Here we present MOLGENIS Research, an open-source web application to collect, analyze and visualize large and complex datasets without the need for advanced skills. MOLGENIS Research is freely available (open source software). It can be installed from...
Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories, ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo, Dataverse or EUDAT). These data have widely different levels of sensitivity and security considerations. For example, clinical observations about genetic mutations in patients are highly sensitive, while observations of species diversity are generally not. The lack of uniformity in data models from one repository...
There is an urgent need to standardize the semantics of biomedical data values, such as phenotypes, to enable comparative and integrative analyses. However, it is unlikely that all studies will use the same data collection protocols. As a result, retrospective standardization is often required, which involves matching original (unstructured or locally coded) data to widely used coding or ontology systems such as SNOMED CT (clinical terms), ICD-10 (International Classification of Disease) and HPO (Human Phenotype Ontology). This...
Data in the life sciences are extremely diverse and are stored in a broad spectrum of repositories, ranging from those designed for particular data types (such as KEGG for pathway data or UniProt for protein data) to those that are general-purpose (such as FigShare, Zenodo or EUDAT). These data have widely different levels of sensitivity and security considerations. For example, clinical observations about genetic mutations in patients are highly sensitive, while observations of species diversity are generally not. The lack of uniformity in data models from one repository to another,...
Motivation: While the size and number of biobanks, patient registries and other data collections are increasing, biomedical researchers still often need to pool data for statistical power, a task that requires time-intensive retrospective integration. Results: To address this challenge, we developed MOLGENIS/connect, a semi-automatic system to find, match and pool data from different sources. The system shortlists relevant source attributes from thousands of candidates using ontology-based query expansion to overcome variations...
Summary: Extensive human health data from cohort studies, national registries, and biobanks can reveal lifecourse risk factors impacting health. Combining these sources offers increased statistical power, rare outcome detection, replication of findings, and extended study periods. Traditionally, this required data transfer to a central location or separate partner analyses with pooled summary statistics, posing ethical, legal, and time constraints. Federated analysis, which involves remote...
Biobanks are indispensable for large-scale genetic/epidemiological studies, yet it remains difficult for researchers to determine which biobanks contain data matching their research questions. To overcome this, we developed a new algorithm that identifies pairs of related data elements between biobanks and research variables with high precision and recall. It integrates lexical comparison, Unified Medical Language System ontology tagging and semantic query expansion. The result is BiobankUniverse, a fast matchmaking service...
Summary: Extensive human health data from cohort studies, national registries, and biobanks can reveal lifecourse risk factors impacting health. Combining these sources offers increased statistical power, rare outcome detection, replication of findings, and extended study periods. Traditionally, this required data transfer to a central location or separate partner analyses with pooled summary statistics, posing ethical, legal, and time constraints. Federated analysis, which involves remote analysis without...