On the cross-population generalizability of gene expression prediction models

RNA-Seq Genetic architecture Replicate
DOI: 10.1371/journal.pgen.1008927 Publication Date: 2020-08-14T19:24:44Z
ABSTRACT
The genetic control of gene expression is a core component of human physiology. For the past several years, transcriptome-wide association studies have leveraged large datasets of linked genotype and RNA sequencing information to create a powerful gene-based test of association that has been used in dozens of studies. While numerous discoveries have been made, the populations in the training data are overwhelmingly of European descent, and little is known about the generalizability of these models to other populations. Here, we test for cross-population generalizability of gene expression prediction models using a dataset of African American individuals with RNA-Seq data in whole blood. We find that the default models trained in large datasets such as GTEx and DGN fare poorly in African Americans, with a notable reduction in prediction accuracy when compared to European Americans. We replicate these limitations in cross-population generalizability using the five populations in the GEUVADIS dataset. Via realistic simulations of both populations and gene expression, we show that accurate cross-population generalizability of transcriptome prediction only arises when eQTL architecture is substantially shared across populations. In contrast, models with non-identical eQTLs showed patterns similar to real-world data. Therefore, generating RNA-Seq data in diverse populations is a critical step towards multi-ethnic utility of gene expression prediction.
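The simulation argument in the abstract can be illustrated with a toy sketch. This is not the authors' simulation pipeline: it assumes independent SNPs (no linkage disequilibrium), uses ridge regression as a stand-in for whatever prediction model the study trained, and every function name and parameter value (`run_scenario`, `h2`, sample sizes, and so on) is hypothetical. It trains an expression predictor in one simulated population and scores it in a second population whose eQTLs are either identical to, or disjoint from, those of the training population.

```python
import numpy as np


def simulate_genotypes(rng, n, freqs):
    """Draw diploid genotypes (0, 1, or 2 alternate alleles) for n individuals."""
    return rng.binomial(2, freqs, size=(n, len(freqs))).astype(float)


def simulate_expression(rng, genotypes, effects, h2):
    """Expression = genetic component + Gaussian noise, with heritability near h2."""
    genetic = genotypes @ effects
    noise_sd = np.sqrt(genetic.var() * (1.0 - h2) / h2)
    return genetic + rng.normal(0.0, noise_sd, size=genetic.shape[0])


def fit_ridge(genotypes, expression, penalty=1.0):
    """Fit ridge regression weights on mean-centered genotypes (assumed model)."""
    snp_means = genotypes.mean(axis=0)
    centered = genotypes - snp_means
    gram = centered.T @ centered + penalty * np.eye(centered.shape[1])
    weights = np.linalg.solve(gram, centered.T @ (expression - expression.mean()))
    return weights, snp_means, expression.mean()


def predict(model, genotypes):
    """Apply a fitted (weights, snp_means, intercept) model to new genotypes."""
    weights, snp_means, intercept = model
    return (genotypes - snp_means) @ weights + intercept


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted expression."""
    return float(np.corrcoef(observed, predicted)[0, 1] ** 2)


def run_scenario(shared, seed=0, n_snps=10, n_causal=5, h2=0.5,
                 n_train=1000, n_test=500):
    """Train in population A; return (within-A R^2, cross-population R^2 in B)."""
    rng = np.random.default_rng(seed)
    # The two populations differ in allele frequencies at every SNP.
    freqs_a = rng.uniform(0.1, 0.5, n_snps)
    freqs_b = rng.uniform(0.1, 0.5, n_snps)

    # Population A: the first n_causal SNPs are eQTLs.
    effects_a = np.zeros(n_snps)
    effects_a[:n_causal] = rng.normal(0.0, 1.0, n_causal)

    if shared:
        # Identical eQTL architecture in population B.
        effects_b = effects_a.copy()
    else:
        # Disjoint eQTLs: a different set of SNPs drives expression in B.
        effects_b = np.zeros(n_snps)
        effects_b[n_causal:2 * n_causal] = rng.normal(0.0, 1.0, n_causal)

    train_g = simulate_genotypes(rng, n_train, freqs_a)
    train_y = simulate_expression(rng, train_g, effects_a, h2)
    model = fit_ridge(train_g, train_y)

    # Held-out individuals from the training population.
    test_a_g = simulate_genotypes(rng, n_test, freqs_a)
    test_a_y = simulate_expression(rng, test_a_g, effects_a, h2)
    # Individuals from the second population.
    test_b_g = simulate_genotypes(rng, n_test, freqs_b)
    test_b_y = simulate_expression(rng, test_b_g, effects_b, h2)

    return (r_squared(test_a_y, predict(model, test_a_g)),
            r_squared(test_b_y, predict(model, test_b_g)))
```

Calling `run_scenario(shared=True)` and `run_scenario(shared=False)` and comparing the second element of each result shows the qualitative pattern the abstract describes: in this toy setting, cross-population accuracy holds up only when the eQTLs are shared. The fully disjoint case is a deliberately extreme choice; partially overlapping eQTLs would be a more realistic setting to explore.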