- Health, Environment, Cognitive Aging
- Data Analysis with R
- Data Quality and Management
- Nutritional Studies and Diet
- Advanced Causal Inference Techniques
- Genetic Associations and Epidemiology
- Health disparities and outcomes
- Research Data Management Practices
- Scientific Computing and Data Management
- Landslides and related hazards
- Birth, Development, and Health
- Sensor Technology and Measurement Systems
- Data-Driven Disease Surveillance
- Gene expression and cancer classification
- Data Mining Algorithms and Applications
- Groundwater flow and contamination studies
- Big Data Technologies and Applications
- Statistical Methods and Inference
- Bioinformatics and Genomic Networks
- Hydrological Forecasting Using AI
- Distributed and Parallel Computing Systems
- Ethics in Clinical Research
- Machine Learning in Healthcare
- Energy Efficiency and Management
- Hydrology and Watershed Management Studies
Epigénétique et Destin Cellulaire
2022-2024
McGill University Health Centre
2013-2018
Research in modern biomedicine and social science requires sample sizes so large that they can often only be achieved through a pooled co-analysis of data from several studies. But pooling information from individuals in a central database that may be queried by researchers raises important ethico-legal questions and can be controversial. In the UK this has been highlighted by recent debate and controversy relating to the UK's proposed 'care.data' initiative, and these issues reflect societal and professional concerns about privacy,...
Individual-level data pooling of large population-based studies across research centres in international projects faces many hurdles. The BioSHaRE (Biobank Standardisation and Harmonisation for Research Excellence in the European Union) project aims to address these issues by building a collaborative group of investigators and developing tools for data harmonization, database integration and federated analyses. Eight studies in six countries were recruited to participate in the project. Through workshops, teleconferences and electronic...
Improving the dissemination of information on existing epidemiological studies and facilitating the interoperability of study databases are essential to maximizing the use of resources and accelerating improvements in health. To address this, Maelstrom Research proposes Opal and Mica, two inter-operable open-source software packages providing out-of-the-box solutions for data management, harmonization and dissemination. Opal and Mica are standalone but inter-operable web applications written in Java, JavaScript and PHP. They provide services...
Combined analysis of multiple, large datasets is a common objective in the health- and biosciences. Existing methods tend to require researchers to physically bring data together in one place or to follow an analysis plan and share results. Developed over the last 10 years, the DataSHIELD platform is a collection of R packages that reduce the challenges of these methods. These include ethico-legal constraints, which limit researchers' ability to physically bring data together, and the analytical inflexibility associated with conventional approaches to sharing results. The key feature of DataSHIELD is that data from...
Background The lack of accessible and structured documentation creates major barriers for investigators interested in understanding, properly interpreting and analyzing cohort data and biological samples. Providing the scientific community with open information is essential to optimize usage of these resources. A cataloguing toolkit is proposed by Maelstrom Research to answer these needs and support the creation of comprehensive and user-friendly study- and network-specific web-based metadata catalogues. Methods Development was...
Existing individual-level human data cover large populations on many dimensions such as lifestyle, demography, laboratory measures, clinical parameters, etc. Recent years have seen investments in data catalogues to FAIRify data descriptions and capitalise on this great promise, i.e. to make catalogue contents more Findable, Accessible, Interoperable and Reusable. However, their valuable diversity has also created heterogeneity, which poses challenges to optimally exploiting their richness. In this opinion review, we analyse catalogues for...
The importance of maintaining data privacy and complying with regulatory requirements is highlighted especially when sharing omic data between different research centers. This challenge is even more pronounced in the scenario where a multi-center effort for collaborative omics studies is necessary. OmicSHIELD is introduced as an open-source tool aimed at overcoming these challenges by enabling privacy-protected federated analysis of sensitive omic data. In order to ensure this, multiple security mechanisms have...
Abstract Summary Extensive human health data from cohort studies, national registries, and biobanks can reveal lifecourse risk factors impacting health. Combining these sources offers increased statistical power, rare outcome detection, replication of findings, and extended study periods. Traditionally, this required data transfer to a central location or separate partner analyses with pooled summary statistics, posing ethical, legal, and time constraints. Federated analysis, which involves remote data analysis without...
In multi-cohort machine learning studies, it is critical to differentiate between effects that are reproducible across cohorts and those that are cohort-specific. Multi-task learning (MTL) is a machine learning approach that facilitates this differentiation through the simultaneous learning of prediction tasks across cohorts. Since multi-cohort data can often not be combined into a single storage solution, there would be substantial utility in an MTL application for geographically distributed data sources. Here, we describe the development of 'dsMTL', a computational framework...
Abstract Multitask learning allows the simultaneous learning of multiple ‘communicating’ algorithms. It is increasingly adopted for biomedical applications, such as the modeling of disease progression. As data protection regulations limit data sharing for such analyses, an implementation of multitask learning on geographically distributed data sources would be highly desirable. Here, we describe the development of dsMTL, a computational framework for privacy-preserving, distributed multi-task machine learning that includes three supervised and one unsupervised algorithms. dsMTL...
The project introduces the new R package, rechaRge, dedicated to open-source groundwater recharge (GWR) models. The goal is to facilitate the simulation of GWR estimates for researchers, professionals, and stakeholders, both hydrogeologists and non-hydrogeologists, by providing all the tools for state-of-the-art modelling with available models in a single package. The package includes functions for data preparation (utility functions), automatic calibration, sensitivity analysis, uncertainty integrated...
Summary. Extensive human health data from cohort studies, national registries, and biobanks can reveal lifecourse risk factors impacting health. Combining these sources offers increased statistical power, rare outcome detection, replication of findings, and extended study periods. Traditionally, this required data transfer to a central location or separate partner analyses with pooled summary statistics, posing ethical, legal, and time constraints. Federated analysis, which involves remote data analysis without...
Abstract Motivation The validity of epidemiologic findings can be increased using triangulation, i.e. comparison of findings across contexts, and by having sufficiently large amounts of relevant data to analyse. However, access to data is often constrained by practical considerations and by ethico-legal and data governance restrictions. Gaining access to such data can be time-consuming due to the governance requirements associated with data access requests to institutions in different jurisdictions. Results DataSHIELD is a software solution that enables remote analysis without the need for...
Abstract Motivation DataSHIELD is an open-source software infrastructure enabling the analysis of data distributed across multiple databases (federated data) without leaking individuals’ information (non-disclosive). It has applications in many scientific domains, ranging from biosciences to social sciences and including high-throughput genomic studies. The R language is used to interact with (and build) DataSHIELD. This creates difficulties for researchers who do not have experience writing R code or...
DATA REPORT article. Front. Public Health, 03 October 2022. Sec. Life-Course Epidemiology and Social Inequalities in Health. Volume 10 - 2022 | https://doi.org/10.3389/fpubh.2022.964086