A new pipeline for the normalization and pooling of metabolomics data
Pooling
Normalization
Data set
DOI: 10.1101/2021.07.16.452593
Publication Date: 2021-07-17T00:15:32Z
AUTHORS (47)
ABSTRACT
Pooling metabolomics data across studies is often desirable to increase the statistical power of the analysis. However, this can raise methodological challenges, as several preanalytical and analytical factors could introduce differences in measured concentrations and variability between datasets. Specifically, different studies may use variable sample types (e.g., serum versus plasma) collected, treated, and stored according to different protocols, and assayed in different laboratories using different instruments. To address these issues, a new pipeline was developed to normalize and pool metabolomics data through a set of sequential steps: (i) exclusion of the least informative observations and metabolites, removal of outliers, and imputation of missing data; (ii) identification of the main sources of variability through PC-PR2 analysis; (iii) application of linear mixed models to remove unwanted variability, including samples’ originating study and batch, and to preserve biological variations while accounting for potential differences in the residual variances across studies. This pipeline was applied to targeted metabolomics data acquired using Biocrates AbsoluteIDQ kits in eight case-control studies nested within the European Prospective Investigation into Cancer and Nutrition (EPIC) cohort. Comprehensive examination of metabolomics measurements indicated that the pipeline improved the comparability of data across the studies. Our pipeline can be adapted to normalize other molecular data, including biomarkers as well as proteomics data, and could be used for pooling molecular datasets, for example in international consortia, to limit biases introduced by inter-study variability. This versatility of the pipeline makes our work of potential interest to molecular epidemiologists.
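The sequential steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract gives no thresholds or imputation rule, so the 50% missingness cut-off and the half-minimum imputation are assumptions, and all function names and the toy data are hypothetical. Outlier removal and the PC-PR2 step are not shown. Step (iii) is replaced by a much simpler stand-in: per-study mean centering on the log scale, which ignores batch, does not model residual variances, and does not protect biological covariates the way a linear mixed model would.

```python
import math
from collections import defaultdict

# Hypothetical toy input: one record per sample, None marks a missing value.
samples = [
    {"study": "A", "values": {"m1": 10.0, "m2": 4.0, "m3": None}},
    {"study": "A", "values": {"m1": 12.0, "m2": None, "m3": None}},
    {"study": "B", "values": {"m1": 20.0, "m2": 8.0, "m3": None}},
    {"study": "B", "values": {"m1": 24.0, "m2": 10.0, "m3": 1.0}},
]


def drop_sparse_metabolites(samples, max_missing=0.5):
    """Step (i), exclusion: keep metabolites missing in at most
    `max_missing` of the samples (threshold is an assumption)."""
    names = sorted({m for s in samples for m in s["values"]})
    keep = [
        m for m in names
        if sum(s["values"].get(m) is None for s in samples) / len(samples)
        <= max_missing
    ]
    return [
        {"study": s["study"], "values": {m: s["values"].get(m) for m in keep}}
        for s in samples
    ]


def impute_half_min(samples):
    """Step (i), imputation: replace missing values by half the observed
    minimum of that metabolite (assumed rule; needs >= 1 observed value)."""
    names = sorted({m for s in samples for m in s["values"]})
    fill = {}
    for m in names:
        observed = [s["values"][m] for s in samples
                    if s["values"].get(m) is not None]
        fill[m] = min(observed) / 2
    return [
        {"study": s["study"],
         "values": {m: s["values"][m] if s["values"].get(m) is not None
                    else fill[m] for m in names}}
        for s in samples
    ]


def center_by_study(samples):
    """Simplified stand-in for step (iii): on the log scale, subtract each
    study's mean and add back the overall mean. Requires positive values."""
    names = sorted(samples[0]["values"])
    grand = {m: sum(math.log(s["values"][m]) for s in samples) / len(samples)
             for m in names}
    by_study = defaultdict(list)
    for s in samples:
        by_study[s["study"]].append(s)
    study_mean = {
        (study, m): sum(math.log(s["values"][m]) for s in group) / len(group)
        for study, group in by_study.items() for m in names
    }
    return [
        {"study": s["study"],
         "values": {m: math.log(s["values"][m])
                    - study_mean[(s["study"], m)] + grand[m] for m in names}}
        for s in samples
    ]


normalized = center_by_study(impute_half_min(drop_sparse_metabolites(samples)))
```

In this toy input, `m3` is missing in three of four samples and is dropped, the missing `m2` value is imputed, and after centering both studies share the same mean log concentration for each retained metabolite.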