Genetic Validation of Psoriasis Phenotyping in UK Biobank Supports the Utility of Self-Reported Data and Composite Definitions for Large Genetic and Epidemiological Studies

Genome-wide Association Study Genetic epidemiology
DOI: 10.1016/j.jid.2023.02.010 Publication Date: 2023-03-03T02:17:35Z
ABSTRACT
In dermatology and elsewhere, GWAS meta-analyses now routinely include data from large-scale population-based biobanks (Zhou et al., 2022). Many examples (Boutin et al., 2020; Han et al., 2020; Mitchell et al., 2022; Tachmazidou et al., 2019) have used UK Biobank, a study of >500,000 participants aged 40–70 years with self-reported and electronic health record–derived clinical diagnoses (Bycroft et al., 2018). However, correct interpretation of genetic or epidemiological associations identified in biobank data should acknowledge that cases selected via study-specific self-report or health record procedures may be subject to misclassification, or may represent a different disease phenotype on average than those ascertained in a specialist setting and typically used in molecular studies of disease processes (Cai et al., 2020).

We focus on chronic plaque psoriasis, reporting a framework that uses genetic effect size estimates to evaluate consistency between candidate psoriasis phenotypes and psoriasis diagnosed by a physician. Specifically, we assess the degree to which candidate definitions capture nonpsoriasis cases, or cases with (presumably milder) disease and lower liability than typical specialist-ascertained cases, by regressing estimated effect sizes at established psoriasis loci against reference values obtained from previous case-control cohorts in which recruitment was based on in-person diagnosis (Tsoi et al., 2017) (Figure 1). Our inverse variance–weighted regression slope estimates a lower bound for the positive predictive value (minPPV) of true psoriasis cases among the cases of a candidate definition (full details are provided in Supplementary Materials and Methods). We validate our approach using dermatologist-derived case-control GWASs and simulated GWASs with a known misclassification rate (Supplementary Materials and Methods, Figure S1, Table S4).

We applied our method to UK Biobank (unrelated White British participants after quality control; N = 336,733), in which psoriasis cases can be defined using a single data source (self-reporting, linked general practitioner [GP] records, or Hospital Episode Statistics; Table 1, Supplementary Table S1) or combinations thereof. Among single-source definitions, self-reported psoriasis (N_SRP = 4,244) was most concordant with specialist-diagnosed psoriasis (minPPV_SRP = 66.9%, 95% confidence interval [CI]: 61.2–72.6%), even more so when accompanied by psoriasis-relevant medication (N = 1,927; minPPV = 73.9%, CI: 65.2–82.6%). Psoriasis cases from Hospital Episode Statistics (HES) were fewer (N_HESany = 1,726) and were less concordant (minPPV_HESany = 57.9%, 48.9–66.8%).
GP-based definitions were least concordant (N_GP = 5,768; minPPV_GP = 46.4%, 40.5–52.3%), albeit improving when multiple GP codes were required (N_GP2 = 2,422; minPPV_GP2 = 58.6%, 50.3–66.9%).

Table 1. List of Selected Candidate Phenotypes, Abbreviations, Case Numbers before and after Genotyping QC, IVW Estimate, and Power to Detect a Common (MAF 30%) Risk Factor of Weak Effect (OR 1.1)

Abbreviation | Phenotype Description | Number of Cases (All) | Number of Cases (Genotyped, Unrelated) | IVW Regression Slope (∼minPPV), Mean (95% CI) (vs Controls, n = 141,279) | Power (vs Controls)
Single source
SRP | Self-reported psoriasis | 6,110 | 4,244 | 0.669 (0.612–0.726) | 0.478
SRPM | Self-reported psoriasis with psoriasis-relevant medication | 2,750 | 1,927 | 0.739 (0.652–0.826) | 0.296
HESmain | Psoriasis as main HES | 449 | 289 | 0.605 (0.422–0.788) | 0.077
HESsec | Psoriasis secondary HES | 2,300 | 1,532 | 0.587 (0.491–0.683) | 0.175
HESany | Psoriasis in HES | 2,593 | 1,726 | 0.579 (0.489–0.668) | 0.178
GPraw | Psoriasis data, read codes corresponding ICD-10 mapping file | 11,560 | 7,956 | 0.324 (0.279–0.370) | 0.243
GP | Psoriasis curated list codes | 8,444 | 5,768 | 0.464 (0.405–0.523) | 0.340
GP2 | Two codes | 3,472 | 2,422 | 0.586 (0.503–0.669) | 0.242
GP3 | Three codes | 1,984 | 1,389 | 0.614 (0.515–0.714) | 0.172
Combined sources
1-SRP-HESany | Any one of SRP, HESany | 7,568 | 5,194 | 0.624 (0.570–0.677) | 0.499
1-SRP-GP | Any one of SRP, GP | 12,616 | 8,647 | 0.517 (0.471–0.563) | 0.543
1-SRP-GP2 | Any one of SRP, GP2 | 8,320 | 5,786 | 0.615 (0.559–0.670) | 0.538
1-SRP-HESany-GP | Any one of SRP, HESany, GP | 13,666 | 9,316 | 0.508 (0.463–0.553) | 0.561
1-SRP-HESany-GP2 | Any one of SRP, HESany, GP2 | 9,546 | 6,574 | 0.585 (0.535–0.636) | 0.542
2-SRP-HESany-GP | Any two of SRP, HESany, GP | 3,056 | 2,122 | 0.721 (0.638–0.805) | 0.303
All-SRP-HESany-GP | All three of SRP, HESany, GP | 425 | 300 | 0.818 (0.616–1.020) | 0.105
2-SRP-SRM-HESany-GP | Any two of SRP, SRM, HESany, GP | 5,080 | 3,499 | 0.696 (0.628–0.763) | 0.443
2-SRP-SRM-HESany-GP2 | Any two of SRP, SRM, HESany, GP2 | 4,291 | 2,965 | 0.726 (0.660–0.792) | 0.416
3-SRP-SRM-HESany-GP | Any three of SRP, SRM, HESany, GP | 1,599 | 1,122 | 0.771 (0.675–0.866) | 0.216
All-SRP-SRM-HESany-GP | All four of SRP, SRM, HESany, GP | 262 | 185 | 0.870 (0.637–1.104) | 0.084
Phenotypes incorporating PsA codes
SRP+PsA | Self-reported psoriasis or psoriatic arthritis | 6,636 | 4,603 | 0.664 (0.606–0.721) | 0.503
SRPM+PsA | Self-reported PsA, medication | 3,013 | 2,107 | 0.747 (0.661–0.832) | 0.330
HESany+PsA | Psoriasis HES | 3,388 | 2,272 | 0.616 (0.530–0.703) | 0.252
GP+PsA | Psoriasis codes | 8,808 | 6,024 | 0.457 (0.398–0.517) | 0.348
1-SRP-HESany-GP+PsA | Any one of SRP+PsA, HESany+PsA, GP+PsA | 14,475 | 9,864 | 0.510 (0.464–0.555) | 0.582
2-SRP-HESany-GP+PsA | Any two of SRP+PsA, HESany+PsA, GP+PsA | 3,719 | 2,579 | 0.713 (0.629–0.797) | 0.348
2-SRP-SRM-HESany-GP+PsA | Any two of SRP+PsA, SRM+PsA, HESany+PsA, GP+PsA | 5,692 | 3,917 | 0.688 (0.620–0.756) | 0.468

Abbreviations: CI, confidence interval; GP, general practitioner; HES, Hospital Episode Statistics; IVW, inverse variance-weighted; MAF, minor allele frequency; minPPV, lower bound for positive predictive value (i.e., IVW regression slope); OR, odds ratio; PsA, psoriatic arthritis; SRP, self-reported psoriasis; SRPM, self-reported psoriasis with psoriasis-relevant medication.

We recognize that the large sample size afforded by a definition may offset limitations in its stringency when considering statistical power to detect novel associations (S2). We therefore estimated power to detect an association with a common risk factor (population frequency 0.3, odds ratio 1.1; Table 1) (details and results for other scenarios are presented in Supplementary Materials and Methods and S3). Among single-source definitions, SRP demonstrated the highest power (power_SRP = 47.8%), substantially higher than the larger but less concordant GP definition (power_GP = 34.0%).

We then considered composite definitions combining data sources. Requiring coding in any one source conferred limited agreement with specialist-defined psoriasis (minPPV_1-SRP-HESany-GP = 50.8%, 46.3–55.3%), but case numbers were such that power exceeded that of all single-source definitions (N_1-SRP-HESany-GP = 9,316; power_1-SRP-HESany-GP = 56.1%). Requiring two independent corroborative codings improved concordance to ∼70% (minPPV_2-SRP-HESany-GP = 72.2%, 63.8–80.5%; minPPV_2-SRP-SRM-HESany-GP = 69.6%, 62.8–76.3%), although power (power_2-SRP-HESany-GP = 30.3%; power_2-SRP-SRM-HESany-GP = 44.3%) remained below that of the top-performing single-source definition (power_SRP = 47.8%). Requiring coding in all sources gave high minPPV estimates (minPPV_All-SRP-HESany-GP = 81.8%, 61.6–102.0%; minPPV_All-SRP-SRM-HESany-GP = 87.0%, 63.7–110.4%), with CIs crossing 100%. This is consistent with the control GWASs, which had slopes between 0.9 and 1.1 (Figure 1, Supplementary Table S4), the smallest cohort (n = 464 cases) being the only exception.

The minPPV for self-reporting in UK Biobank (67%) is much higher than for 23andMe (36%), potentially due to ascertainment differences: rather than completing an online questionnaire, UK Biobank participants were interviewed by a trained research nurse about whether they had seen a doctor for each reported condition (UK Biobank, 2012; https://biobank.ctsu.ox.ac.uk/showcase/ukb/docs/TouchscreenQuestionsMainFinal.pdf). Primary care and hospital records had lower minPPVs than self-reporting, possibly owing to misclassification because of the difficulty of nonspecialist differential diagnosis of lesional skin diseases.
Alternatively, patients primary episodes (in recorded secondary) milder average, consequently reduced liability, included dermatologist-diagnosed GWASs; work showed 90% subsequently confirmed reviewers (Seminara et al., 2011). relatively indicators represent not also mild severe disease. without formal validation exercise, methods here unable distinguish these scenarios. biology conducted moderate-severe remain valuable measure aggregate equivalent dermatologist-ascertained psoriasis. optimal future investigations will depend specific aims. recommend research, priority, defines (and maximum 58% interpreted context contributing meta-analyses); requiring accurate encouraged use inclusion arthritis diagnostic beneficial minimal drop-off It remains unclear whether unaffected arthritis–only having cutaneous coded shares balances both validity power; generalization this finding datasets method. To facilitate assessments, composition assembling record/questionnaire-based studies.

available bona fide researchers health-related public interest (https://www.ukbiobank.ac.uk/enable-your-research). Biomarkers Systemic Treatment Outcomes (BSTOP) approved making application BSTOP Data Access Committee (https://www.kcl.ac.uk/lsm/research/divisions/gmm/departments/dermatology/research/stru/groups/bstop/documents).

Jake R. Saklatvala: http://orcid.org/0000-0003-0836-4928
Ken B. Hanscombe: http://orcid.org/0000-0002-3715-6805
Satveer Mahil: http://orcid.org/0000-0003-4692-3794
Lam Tsoi: http://orcid.org/0000-0003-1627-5722
James T. Elder: http://orcid.org/0000-0003-4215-3294
Jonathan Barker: http://orcid.org/0000-0002-9030-183X
Michael Simpson: http://orcid.org/0000-0002-8539-8753
Catherine Smith: http://orcid.org/0000-0001-9918-1144
Nick Dand: http://orcid.org/0000-0002-1805-6278

SKM reports departmental income AbbVie, Almirall, Eli Lilly, Novartis, Sanofi, UCB, outside submitted work. CHS principal investigator MRC (PSORT) EC-funded consortia industry partners (see PSORT.org.uk, BIOMAP-IMI.eu, HIPPOCRATES-IMI.eu up-to-date listings contributory partners), co-supervisor PhD studentships MRC/industry collaboration (Boehringer Ingelheim GmbH). remaining authors state no conflict interest.

project has received funding Innovative Medicines Initiative 2 Joint Undertaking (JU) under grant number 821511 (Biomarkers Atopic Dermatitis Psoriasis). JU receives support European Union's Horizon 2020 innovation program Federation Pharmaceutical Industries Associations. publication reflects author's view responsible made information it contains. been (approved 15147) collected NHS part their support. ND Research (MR/S003126/1), funded Medical Council, Engineering Physical Sciences Economic Social Department Care (England), Chief Scientist Office Scottish Government Directorates, Development Division (Welsh Government), Public Agency (Northern Ireland), Heart Foundation, Wellcome Trust. Clinical Academic Partnership Award (MR/T02383X/1). would like thank Association ongoing since inception (reference: RG2/10: RG2/10). invaluable National Institute networks its facilitating BSTOP.
Members Study Group who contributed collection samples profiling (excluding individually named work) Nadia Aldoori, Mahmud Ali, Alex Anstey, Fiona Antony, Charles Archer, Suzanna August, Periasamy Balasubramaniam, Kay Baxter, Anthony Bewley, Alexandra Bonsall, Victoria Brown, Katya Burova, Aamir Butt, Mel Caswell, Sandeep Cliff, Mihaela Costache, Sharmela Darne, Emily Davies, Claudia DeGiovanni, Trupti Desai, Bernadette DeSilva, Diba, Eva Domanne, Harvey Dymond, Caoimhe Fahy, Leila Ferguson, Maria-Angeliki Gkini, Alison Godwin, Hammonds, Sarah Johnson, Teresa Joseph, Manju Kalavala, Mohsen Khorshid, Liberta Labinoti, Nicole Lawson, Layton, Tara Lees, Levell, Helen Lewis, Calum Lyon, Sandy McBride, Sally McCormack, Kevin McKenna, Serap Mellor, Ruth Murphy, Paul Norris, Caroline Owen, Urvi Popli, Gay Perera, Nabil Ponnambath, Ramsay, Aruni Ranasinghe, Saskia Reeken, Rebecca Rose, Rada Rotarescu, Ingrid Salvary, Kathy Sands, Tapati Sinha, Simina Stefanescu, Kavitha Sundararaj, Taghipour, Michelle Taylor, Thomson, Joanne Topliffe, Roberto Verdolini, Rachel Wachsmuth, Martin Wade, Shyamal Wahie, Walsh, Shernaz Walton, Louise Wilcox, Andrew Wright. Conceptualization: ND; Curation: SKM, CHS; Formal Analysis: JRS, KBH; Funding Acquisition: JNB, MAS, CHS, Investigation: Methodology: Project Administration: Resources: LCT, JTE; Supervision: Writing - Original Draft Preparation: Review Editing: KBH, JTE, 15147. prospective 40–69 recruited 2006–2010 (2933) continues collect extensive phenotypic genotypic detail about participants, including questionnaires, physical measures, assays, accelerometry, multimodal imaging, genotyping, longitudinal follow-up wide range outcomes. Linkage comprises document inpatient visits (currently 230,105 participants). Service Ethics (approval nos. 11/NW/0382, 16/NW/0274), written informed consent. nine type (Table 1): illnesses medications, data. given S1. 
Self-reported medications prescription taken regularly time assessment center visit (UK field 20003). full reviewed dermatologists (SKM, CHS) identify medications. Linked types code: readV2 readCTV3 (NHS Digital, 2020; https://digital.nhs.uk/services/terminology-and-classifications/read-codes) types. further distinguished corresponded International Classification Diseases, 10th Revision, L40 file 1: GPraw) (UK Biobank; https://biobank.ndph.ox.ac.uk/ukb/refer.cgi?id=592) previously validated GP) Validated code lists format mapped combining These ranged broader sufficient, stricter Because presents lesions, additional expanded self-report, (Ogdie et al., 2013), 1; S1).

central team performed genotype calling imputation. Affymetrix BiLEVE Axiom array ∼50,000) ∼450,000) Based metrics removed exhibited gender mismatch, excess relatedness, heterozygosity, missingness > 5% extracted individuals determined form unrelated subset homogeneous (White British) ancestry. call rates (<98%) well-called (>90%) markers, giving 336,814 subsequent (336,733 withdrawals). Genome-wide imputation IMPUTE2 software panel derived UK10K 1,000 Genomes phase 3 haplotypes (Howie et al., 2011, 2009) For analysis, variants R2 0.7 0.5%. testing 35 section) generated. Each set cases. controls, negative 141,279). fitted logistic variant PLINK v2.0 (Chang et al., 2015) 20 ancestry components genotyping covariates.
derive instrument representative summary statistics seven (totaling 13,229 21,543 controls) (Tsoi et al., 2017) analyzed (IVW) fixed meta-analysis. 38 significant (P < 5 × 10^−8) Mb apart. excluded histocompatibility complex region chromosome 6: strong HLA-C∗06:02 age onset means locus strongly influenced strategy, comparison complex. Of lead variants, imputed dataset, whereas unavailable suitable proxy found LDLink platform (Machiela and Chanock, 2015) definition, (betas) regressed S5), weighted variance give weight confident estimates, function mr_ivw R package MendelianRandomization (version 0.5.0) (Yavorska and Burgess, 2017) line gives indication how depressed, 1); indicate already understand accuracy affected stringent alternative models "unselected" controls observed slightly slopes unselected With assumption group GWAS, depression relative could driven misclassified (compared Tsoi al.). incorrect misdiagnosis nondermatologists At locus, (PPV; positives / + false positives) magnitude effect. relationship PPV complex, undertook simulations inform slopes. Using v1.9 representing SNVs instrument. simulation, 125,000 individuals, 5,000 mix (true posi
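The slope calculation described in this section can be sketched in Python, as an alternative to the R function mr_ivw named in the text. This is a minimal illustration under stated assumptions, not the authors' pipeline. It assumes a fixed-effect inverse variance-weighted regression through the origin (the mr_ivw options used are not given here). In place of a PLINK genotype simulation it uses an expected-value calculation in which defined cases are a mixture: a fraction PPV are true cases and the remainder carry control allele frequencies. The four loci, the 120,000 controls and every function name are hypothetical; only the 5,000 cases figure echoes the text.

```python
import math


def logit(p):
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def case_freq(p0, odds_ratio):
    """Risk-allele frequency in true cases, given control frequency and allelic OR."""
    odds = odds_ratio * p0 / (1.0 - p0)
    return odds / (1.0 + odds)


def observed_effect(p0, odds_ratio, ppv, n_cases, n_controls):
    """Expected log OR and its standard error for a case set with the given PPV.

    Assumption: misclassified cases have the control allele frequency p0.
    The standard error uses the usual Woolf formula on expected allele counts.
    """
    p_mix = ppv * case_freq(p0, odds_ratio) + (1.0 - ppv) * p0
    beta = logit(p_mix) - logit(p0)
    var = (1.0 / (2 * n_cases * p_mix) + 1.0 / (2 * n_cases * (1.0 - p_mix))
           + 1.0 / (2 * n_controls * p0) + 1.0 / (2 * n_controls * (1.0 - p0)))
    return beta, math.sqrt(var)


def ivw_slope(ref_betas, obs_betas, obs_ses):
    """Fixed-effect IVW regression of observed on reference effects, no intercept."""
    weights = [1.0 / se ** 2 for se in obs_ses]
    num = sum(w * x * y for w, x, y in zip(weights, ref_betas, obs_betas))
    den = sum(w * x * x for w, x in zip(weights, ref_betas))
    return num / den, math.sqrt(1.0 / den)


def confidence_interval(slope, se, z=1.96):
    """Approximate 95% confidence interval for the slope."""
    return slope - z * se, slope + z * se


# Hypothetical instrument: (control risk-allele frequency, true odds ratio).
LOCI = [(0.30, 1.20), (0.10, 1.50), (0.50, 0.80), (0.25, 1.10)]


def slope_for_ppv(ppv, n_cases=5000, n_controls=120000):
    """IVW slope (an estimate of minPPV) for a case definition with the given PPV."""
    ref = [math.log(odds_ratio) for _, odds_ratio in LOCI]
    obs = [observed_effect(p0, odds_ratio, ppv, n_cases, n_controls)
           for p0, odds_ratio in LOCI]
    return ivw_slope(ref, [b for b, _ in obs], [s for _, s in obs])


if __name__ == "__main__":
    for ppv in (1.0, 0.7, 0.4):
        slope, se = slope_for_ppv(ppv)
        low, high = confidence_interval(slope, se)
        print(f"PPV={ppv:.1f}  slope={slope:.3f}  95% CI=({low:.3f}, {high:.3f})")
```

Under this simplified mixture the slope follows the PPV closely but not exactly, because the log odds ratio is not linear in allele frequency. That is in keeping with the text's remark that the relationship between PPV and effect size is complex and was explored by simulation.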