NFDI4DS | UHH-SEMS - Publication Details

Yannick Marcon

ORCID: 0000-0003-0138-2023

About

Contact & Profiles

A5065133756

Research Areas

Health, Environment, Cognitive Aging
Data Analysis with R
Data Quality and Management
Nutritional Studies and Diet
Advanced Causal Inference Techniques
Genetic Associations and Epidemiology
Health disparities and outcomes
Research Data Management Practices
Scientific Computing and Data Management
Landslides and related hazards
Birth, Development, and Health
Sensor Technology and Measurement Systems
Data-Driven Disease Surveillance
Gene expression and cancer classification
Data Mining Algorithms and Applications
Groundwater flow and contamination studies
Big Data Technologies and Applications
Statistical Methods and Inference
Bioinformatics and Genomic Networks
Hydrological Forecasting Using AI
Distributed and Parallel Computing Systems
Ethics in Clinical Research
Machine Learning in Healthcare
Energy Efficiency and Management
Hydrology and Watershed Management Studies

Epigénétique et Destin Cellulaire
2022-2024

McGill University Health Centre
2013-2018

DataSHIELD: taking the analysis to the data, not the data to the analysis

OPENALEX - Publications

Amadou Gaye Yannick Marcon Julia Isaeva Philippe Laflamme Andrew Turner and 44 more

Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But the pooling of information from individuals in a central database that may be queried by researchers raises important ethico-legal questions and can be controversial. In the UK this has been highlighted by recent debate and controversy relating to the UK's proposed 'care.data' initiative, and these issues reflect important societal and professional concerns about privacy,...

10.1093/ije/dyu188 article EN cc-by-nc International Journal of Epidemiology 2014-09-27

Data harmonization and federated analysis of population-based studies: the BioSHaRE project

Dany Doiron Paul R. Burton Yannick Marcon Amadou Gaye Bruce H. R. Wolffenbuttel and 11 more

Individual-level data pooling of large population-based studies across research centres in international projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated data analyses. Eight studies in six countries were recruited to participate in the project. Through workshops, teleconferences and electronic...

10.1186/1742-7622-10-12 article EN cc-by Emerging Themes in Epidemiology 2013-11-21

Software Application Profile: Opal and Mica: open-source software solutions for epidemiological data management, harmonization and dissemination

Dany Doiron Yannick Marcon Isabel Fortier Paul Burton Vincent Ferretti

Improving the dissemination of information on existing epidemiological studies and facilitating the interoperability of study databases are essential to maximizing the use of resources and accelerating improvements in health. To address this, Maelstrom Research proposes Opal and Mica, two inter-operable open-source software packages providing out-of-the-box solutions for epidemiological data management, harmonization and dissemination. Opal and Mica are standalone but inter-operable web applications written in Java, JavaScript and PHP. They provide services...

10.1093/ije/dyx180 article EN cc-by-nc International Journal of Epidemiology 2017-08-09

Orchestrating privacy-protected big data analyses of data from different resources with R and DataSHIELD

Yannick Marcon Tom Bishop Demetris Avraam Xavier Escribà-Montagut Patricia Ryser-Welch and 3 more

Combined analysis of multiple, large datasets is a common objective in the health- and biosciences. Existing methods tend to require researchers to physically bring data together in one place or follow an analysis plan and share results. Developed over the last 10 years, the DataSHIELD platform is a collection of R packages that reduce the challenges of these methods. These include ethico-legal constraints which limit researchers' ability to physically bring data together and the analytical inflexibility associated with conventional approaches to sharing results. The key feature of DataSHIELD is that data from...

10.1371/journal.pcbi.1008880 article EN cc-by PLoS Computational Biology 2021-03-30

Fostering population-based cohort data discovery: The Maelstrom Research cataloguing toolkit

Julie Bergeron Dany Doiron Yannick Marcon Vincent Ferretti Isabel Fortier

Background: The lack of accessible and structured documentation creates major barriers for investigators interested in understanding, properly interpreting and analyzing cohort data and biological samples. Providing the scientific community with open information is essential to optimize the usage of these resources. A cataloguing toolkit is proposed by Maelstrom Research to answer these needs and support the creation of comprehensive and user-friendly study- and network-specific web-based metadata catalogues. Methods: Development was...

10.1371/journal.pone.0200926 article EN cc-by PLoS ONE 2018-07-24

Towards an Interoperable Ecosystem of Research Cohort and Real-world Data Catalogues Enabling Multi-center Studies

Morris A. Swertz David van Enckevort José Luís Oliveira Isabel Fortier Julie Bergeron and 10 more

Existing individual-level human data cover large populations on many dimensions such as lifestyle, demography, laboratory measures, clinical parameters, etc. Recent years have seen large investments in data catalogues to FAIRify data descriptions and capitalise on this great promise, i.e. make catalogue contents more Findable, Accessible, Interoperable and Reusable. However, their valuable diversity also created heterogeneity, which poses challenges to optimally exploit their richness. In this opinion review, we analyse catalogues for...

10.1055/s-0042-1742522 article EN cc-by-nc-nd Yearbook of Medical Informatics 2022-08-01

Federated privacy-protected meta- and mega-omics data analysis in multi-center studies with a fully open-source analytic platform

Xavier Escribà-Montagut Yannick Marcon Augusto Anguita‐Ruiz Demetris Avraam José Urquiza and 4 more

The importance of maintaining data privacy and complying with regulatory requirements is highlighted especially when sharing omic data between different research centers. This challenge is even more pronounced in the scenario where a multi-center effort for collaborative omics studies is necessary. OmicSHIELD is introduced as an open-source tool aimed at overcoming these challenges by enabling privacy-protected federated analysis of sensitive omic data. In order to ensure this, multiple security mechanisms have...

10.1371/journal.pcbi.1012626 article EN cc-by PLoS Computational Biology 2024-12-09

MOLGENIS Armadillo: a lightweight server for federated analysis using DataSHIELD

Tim Cadman Mariska Slofstra Marije A. van der Geest Demetris Avraam Tom Bishop and 17 more

Abstract. Summary: Extensive human health data from cohort studies, national registries, and biobanks can reveal lifecourse risk factors impacting health. Combining these sources offers increased statistical power, rare outcome detection, replication of findings, and extended study periods. Traditionally, this required data transfer to a central location or separate partner analyses with pooled summary statistics, posing ethical, legal, and time constraints. Federated analysis, which involves remote...

10.1093/bioinformatics/btae726 article EN cc-by Bioinformatics 2024-12-02

dsMTL: a computational framework for privacy-preserving, distributed multi-task machine learning

Han Cao Youcheng Zhang Jan Baumbach Paul R. Burton Dominic Dwyer and 9 more

In multi-cohort machine learning studies, it is critical to differentiate between effects that are reproducible across cohorts and those that are cohort-specific. Multi-task learning (MTL) is a machine learning approach that facilitates this differentiation through the simultaneous learning of prediction tasks across cohorts. Since multi-cohort data can often not be combined into a single storage solution, there would be substantial utility in an MTL application for geographically distributed data sources. Here, we describe the development of 'dsMTL', a computational framework...

10.1093/bioinformatics/btac616 article EN cc-by Bioinformatics 2022-09-07

dsMTL - a computational framework for privacy-preserving, distributed multi-task machine learning

Han Cao Youcheng Zhang Jan Baumbach Paul R. Burton Dominic Dwyer and 9 more

Abstract: Multitask learning allows the simultaneous learning of multiple ‘communicating’ algorithms. It is increasingly adopted for biomedical applications, such as the modeling of disease progression. As data protection regulations limit data sharing for such analyses, an implementation of multitask learning on geographically distributed data sources would be highly desirable. Here, we describe the development of dsMTL, a computational framework for privacy-preserving, distributed multi-task machine learning that includes three supervised and one unsupervised algorithms. dsMTL...

10.1101/2021.08.26.457778 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-08-28

rechaRge – a package for integrated groundwater recharge modelling in R

Yannick Marcon Emmanuel Dubois

The project introduces the new R package, rechaRge, dedicated to open-source groundwater recharge (GWR) models. The goal is to facilitate the simulation of GWR estimates for researchers, professionals, and stakeholders, both hydrogeologists and non-hydrogeologists, by providing all the tools for state-of-the-art modelling with available models in a single package. The package includes functions for data preparation (utility functions), automatic calibration, sensitivity analysis, uncertainty analysis, integrated...

10.5194/egusphere-egu24-16210 preprint EN 2024-03-09

MOLGENIS Armadillo: a lightweight server for federated analysis using DataSHIELD

Tim Cadman Mariska Slofstra Marije van der Geest Demetris Avraam Tom Bishop and 17 more

Summary. Extensive human health data from cohort studies, national registries, and biobanks can reveal lifecourse risk factors impacting health. Combining these sources offers increased statistical power, rare outcome detection, replication of findings, and extended study periods. Traditionally, this required data transfer to a central location or separate partner analyses with pooled summary statistics, posing ethical, legal, and time constraints. Federated analysis – which involves remote data analysis without...

10.31219/osf.io/xc86p preprint EN 2024-10-28

DataSHIELD: Mitigating disclosure risk in a multi-site federated analysis platform

Demetris Avraam Rebecca Wilson Noemi Aguirre Chan Soumya Banerjee Tom Bishop and 29 more

Abstract. Motivation: The validity of epidemiologic findings can be increased using triangulation, i.e. comparison of findings across contexts, and by having sufficiently large amounts of relevant data to analyse. However, access to data is often constrained by practical considerations and by ethico-legal and data governance restrictions. Gaining access to such data can be time-consuming due to the governance requirements associated with data access requests to institutions in different jurisdictions. Results: DataSHIELD is a software solution that enables remote analysis without the need for...

10.1093/bioadv/vbaf046 article EN cc-by Bioinformatics Advances 2024-12-26

Software Application Profile: ShinyDataSHIELD—an R Shiny application to perform federated non-disclosive data analysis in multicohort studies

Xavier Escribà-Montagut Yannick Marcon Demetris Avraam Soumya Banerjee Tom Bishop and 2 more

Abstract. Motivation: DataSHIELD is an open-source software infrastructure enabling the analysis of data distributed across multiple databases (federated data) without leaking individuals’ information (non-disclosive). It has applications in many scientific domains, ranging from biosciences to social sciences and including high-throughput genomic studies. The R language is used to interact with (and build) DataSHIELD. This creates difficulties for researchers who do not have experience writing R code or...

10.1093/ije/dyac201 article EN cc-by-nc-nd International Journal of Epidemiology 2022-10-27

Worldwide mapping of initiatives that integrate population cohorts

Laura Alejandra Rico‐Uribe Daniel Morillo-Cuadrado Ángel Rodríguez‐Laso Ellen Vorstenbosch Andreas J. Weser and 5 more

Data Report article. Front. Public Health, 03 October 2022. Sec. Life-Course Epidemiology and Social Inequalities in Health. Volume 10 - 2022 | https://doi.org/10.3389/fpubh.2022.964086

10.3389/fpubh.2022.964086 article EN cc-by Frontiers in Public Health 2022-10-03