Michel Dumontier
- Biomedical Text Mining and Ontologies
- Semantic Web and Ontologies
- Scientific Computing and Data Management
- Research Data Management Practices
- Bioinformatics and Genomic Networks
- Computational Drug Discovery Methods
- Data Quality and Management
- Genomics and Phylogenetic Studies
- Service-Oriented Architecture and Web Services
- Privacy-Preserving Technologies in Data
- Genetics, Bioinformatics, and Biomedical Research
- Machine Learning in Healthcare
- Ethics in Clinical Research
- Pharmacogenetics and Drug Metabolism
- Music and Audio Processing
- Artificial Intelligence in Healthcare
- Natural Language Processing Techniques
- Electronic Health Records Systems
- Big Data and Business Intelligence
- Gene Regulatory Network Analysis
- Gene Expression and Cancer Classification
- Genomics and Rare Diseases
- Microbial Metabolic Engineering and Bioproduction
- Pharmacovigilance and Adverse Drug Reactions
- Scientometrics and Bibliometrics Research
Maastricht University
2016-2025
Economie Publique
2023
Research Institute for Knowledge Systems
2023
Carleton University
2007-2022
Stanford University
2012-2022
Brandenburg-Berliner Institut für Sozialwissenschaftliche Studien
2021
Vrije Universiteit Amsterdam
2018
Stanford Medicine
2015-2018
Rensselaer Polytechnic Institute
2017
Biological E (India)
2017
There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability...
The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive machine-readable archive of computable information, which provides users with methods to discover mechanisms. BIND has...
The FAIR Data Principles propose that all scholarly output should be Findable, Accessible, Interoperable, and Reusable. As a set of guiding principles, expressing only the kinds of behaviours that researchers should expect from contemporary data resources, how the FAIR principles should manifest in reality was largely open to interpretation. As support for the Principles has spread, so has the breadth of these interpretations. In observing this creeping spread of interpretation, several of the original authors felt it was now appropriate to revisit the Principles,...
The Ontology for Biomedical Investigations (OBI) is an ontology that provides terms with precisely defined meanings to describe all aspects of how investigations in the biological and medical domains are conducted. OBI re-uses ontologies that provide a representation of biomedical knowledge from the Open Biological Ontologies (OBO) project and adds the ability to describe how this knowledge was derived. We here describe the state of OBI and several applications that are using it, such as adding semantic expressivity to existing databases, building data entry forms,...
The effectiveness of most cancer targeted therapies is short-lived. Tumors often develop resistance that might be overcome with drug combinations. However, the number of possible drug combinations is vast, necessitating data-driven approaches to find optimal patient-specific treatments. Here we report AstraZeneca’s large drug combination dataset, consisting of 11,576 experiments from 910 combinations across 85 molecularly characterized cancer cell lines, and the results of a DREAM Challenge to evaluate computational strategies for...
The FAIR principles have been widely cited, endorsed and adopted by a broad range of stakeholders since their publication in 2016. By intention, the 15 FAIR guiding principles do not dictate specific technological implementations, but provide guidance for improving the Findability, Accessibility, Interoperability and Reusability of digital resources. This has likely contributed to the broad adoption of the FAIR principles, because individual stakeholder communities can implement their own FAIR solutions. However, it has also resulted in inconsistent...
The Semanticscience Integrated Ontology (SIO) is an ontology to facilitate biomedical knowledge discovery. SIO features a simple upper level comprised of essential types and relations for the rich description of arbitrary (real, hypothesized, virtual, fictional) objects, processes and their attributes. SIO specifies simple design patterns to describe and associate qualities, capabilities, functions, quantities, and informational entities including textual, geometrical, and mathematical entities, and provides specific extensions...
Despite a large and multifaceted effort to understand the vast landscape of phenotypic data, their current form inhibits productive data analysis. The lack of a community-wide, consensus-based, human- and machine-interpretable language for describing phenotypes and their genomic and environmental contexts is perhaps the most pressing scientific bottleneck to integration across many key fields in biology, including genomics, systems biology, development, medicine, evolution, ecology, and systematics. Here we survey the phenomics...
The FAIR Principles 1 (https:/
Adverse events resulting from drug-drug interactions (DDIs) pose a serious health issue. The ability to automatically extract DDIs described in the biomedical literature could further efforts for ongoing pharmacovigilance. Most neural network-based methods typically focus on the sentence sequence to identify these DDIs; however, the shortest dependency path (SDP) between the two entities contains valuable syntactic and semantic information. Effectively exploiting such information may improve DDI...
Transparent evaluations of FAIRness are increasingly required by a wide range of stakeholders, from scientists to publishers, funding agencies and policy makers. We propose a scalable, automatable framework to evaluate digital resources that encompasses measurable indicators, open source tools, and participation guidelines, which come together to accommodate domain relevant community-defined FAIR assessments. The components of the framework are: (1) Maturity Indicators – community-authored specifications...
Reproducibility and reusability of research results is an important concern in scientific communication and science policy. A foundational element of reproducibility and reusability is the open and persistently available presentation of research data. However, many common approaches for primary data publication in use today do not achieve sufficient long-term robustness, openness, accessibility or uniformity. Nor do they permit comprehensive exploitation by modern Web technologies. This has led to several authoritative studies...
PubChem is an open repository for chemical structures, biological activities and biomedical annotations. Semantic Web technologies are emerging as an increasingly important approach to distribute and integrate scientific data. Exposing PubChem data to Semantic Web services may help enable automated data integration and management, as well as facilitate interoperable web applications. This work, one of a series covering the PubChemRDF project, describes the translation of PubChem Substance and Compound information into Resource Description Framework (RDF)...
In many disciplines, data are highly decentralized across thousands of online databases (repositories, registries, and knowledgebases). Wringing value from such databases depends on the discipline of data science and on the humble bricks and mortar that make integration possible; identifiers are a core component of this integration infrastructure. Drawing on our experience and on work by other groups, we outline 10 lessons we have learned about the identifier qualities and best practices that facilitate large-scale data integration. Specifically, we propose actions...
Cheminformatics is the application of informatics techniques to solve chemical problems in silico. There are many areas in biology where cheminformatics plays an important role in computational research, including metabolism, proteomics, and systems biology. One critical aspect of these fields is the accurate exchange of data, which is increasingly accomplished through the use of ontologies. Ontologies are formal representations of objects and their properties using a logic-based ontology language. Many such ontologies are currently...
Although potential drug–drug interactions (PDDIs) are a significant source of preventable drug-related harm, there is currently no single complete source of PDDI information. In the current study, all publicly available sources of PDDI information that could be identified using a comprehensive and broad search were combined into a single dataset. The combined dataset merged fourteen different sources including 5 clinically-oriented information sources, 4 Natural Language Processing (NLP) Corpora, and 5 Bioinformatics/Pharmacovigilance information sources. As...
In recent years, as newer technologies have evolved around the healthcare ecosystem, more and more data have been generated. Advanced analytics could power the data collected from numerous sources, both from healthcare institutions and generated by individuals themselves via apps and devices, and lead to innovations in the treatment and diagnosis of diseases; improve the care given to the patient; and empower citizens to participate in the decision-making process regarding their own health and well-being. However, the sensitive nature of health data prohibits healthcare organizations from sharing the data....