Discovery of O-GlcNAc-modified Proteins in Published Large-scale Proteome Data

DOI: 10.1074/mcp.m112.019463 Publication Date: 2012-06-02T05:37:37Z
ABSTRACT
The attachment of N-acetylglucosamine to serine or threonine residues (O-GlcNAc) is a post-translational modification on nuclear and cytoplasmic proteins with emerging roles in numerous cellular processes, such as signal transduction, transcription, and translation. It is further presumed that O-GlcNAc can exhibit a site-specific, dynamic, and possibly functional interplay with phosphorylation. O-GlcNAc proteins are commonly identified by tandem mass spectrometry following some form of biochemical enrichment. In the present study, we assessed if, and to which extent, O-GlcNAc-modified proteins can be discovered from existing large-scale proteome data sets. To this end, we conceived a straightforward O-GlcNAc identification strategy based on our recently developed Oscore software, which automatically analyzes tandem mass spectra for the presence and intensity of O-GlcNAc diagnostic fragment ions. Using the Oscore, we discovered hundreds of O-GlcNAc peptides not initially identified in these studies, most of which have not been described before. Merely re-searching these data extended the number of known O-GlcNAc proteins by almost 100, suggesting that this modification exists even more widely than previously anticipated and is often sufficiently abundant to be detected without biochemical enrichment. However, a comparison of O-GlcNAc and phospho-identifications from the very same data indicates that the O-GlcNAc modification is considerably less abundant than phosphorylation. The discovery of numerous doubly modified peptides (i.e. peptides with one or multiple O-GlcNAc or phosphate moieties) suggests that O-GlcNAc and phosphorylation are not necessarily mutually exclusive, but can occur simultaneously at adjacent sites.

The abbreviations used are: O-GlcNAc, O-linked N-acetylglucosamine; HCD, higher collision energy dissociation; HexNAc, N-acetylgalactosamine, N-acetylglucosamine; NSAF, normalized spectral abundance factor; PSM, peptide-spectrum-match.

an proteins. found wide range involved virtually all processes well various human diseases (1Hart G.W. Slawson C. Ramirez-Correa G. Lagerlof O. Cross talk between O-GlcNAcylation phosphorylation: signaling, chronic disease.Annu. Rev. Biochem. 2011; 80: 825-858Crossref PubMed Scopus (920) Google Scholar, 2Hu P. Shimoji S. Hart Site-specific regulation.FEBS Lett. 
2010; 584: 2526-2538Crossref (132) Scholar) including cancer (3Slawson signalling: implications cell biology.Nat. Cancer. 11: 678-684Crossref (330) Scholar). addition, phosphorylation, which, instance, modulates stability activity p53 (4Yang W.H. Kim J.E. Nam H.W. Ju J.W. H.S. Y.S. Cho Modification O-linked regulates stability.Nat. Cell Biol. 2006; 8: 1074-1083Crossref (340) Despite its biological importance, analysis remains highly challenging. fact, ∼800 reported proteins, direct unambiguous evidence site O-glycosylation available 25% (5Wang J. Torii M. Liu H. Hu Z.Z. dbOGAP - integrated bioinformatics resource protein O-GlcNAcylation.BMC Bioinformatics. 12: 91Crossref (83) higher dissociation N-acetylgalactosamin, N-acetylglucosamin normalized factor peptide-spectrum-match. typically achieved combining selective enrichment liquid chromatography (LC-MS/MS). Albeit powerful, sites hindered substoichiometric occupancy (2Hu lability O-glycosidic bond gas phase (6Huddleston M.J. Bean M.F. Carr S.A. Collisional fragmentation glycopeptides electrospray ionization LC/MS LC/MS/MS: methods detection digests.Anal. Chem. 1993; 65: 877-884Crossref (371) spectrometry-based proteomics, usually sequenced via collision-induced (CID). under typical CID conditions, concurrent peptide difficult, because readily lose GlcNAc moiety, dominated neutral loss species along oxonium ion fragments thereof (7Chalkley R.J. Burlingame A.L. Identification GlcNAcylation alpha-crystallin using Q-TOF spectrometry.J. Am. Soc. Mass Spectrom. 2001; 1106-1113Crossref (71) Peptide sequence still possible lost information irretrievably upon bond. contrast, electron capture (ECD) transfer (ETD) preserves PTM allows simultaneous sequences (8Mirgorodskaya E. Roepstorff Zubarev R.A. Localization Fourier transform spectrometer.Anal. 1999; 71: 4431-4436Crossref (348) 9Vosseller K. Trinidad J.C. Chalkley Specht C.G. Thalhammer A. Lynn A.J. Snedecor J.O. Guan Medzihradszky K.F. Maltby D.A. Schoepfer R. 
proteomics postsynaptic density preparations lectin weak affinity spectrometry.Mol. Cell. Proteomics. 5: 923-934Abstract Full Text PDF (285) techniques also shortcomings notably concerning sensitivity current commercial platforms. Although ideal localization, initial strongly facilitated CID-type experiments (10Haynes P.A. Aebersold Simultaneous glycoproteins chromatography-tandem spectrometry.Anal. 2000; 72: 5402-5410Crossref (65) 11Chalkley novel O-N-acetylglucosamine serum response quadrupole time-of-flight 2003; 2: 182-190Abstract (44) losses define characteristic pattern, identifies complex samples (9Vosseller availability high resolution accuracy instruments improves selectivity ions (12Hahne Kuster B. A two-stage approach scoring scheme peptides.J. 22: 931-942Crossref (21) 13Zhao Viner Teo C.F. Boons G.J. Horn D. Wells L. Combining high-energy C-trap assignment.J. Proteome Res. 10: 4088-4104Crossref (120) We tool, termed assesses MS and, turn, ranking according their probability representing On test set 750 11,300 unmodified peptides, was able discriminate 95% >99% specificity outperformed alternative approaches simple filtering show applied proteomic discover studies. enough specific Publically raw spectrometric published proteome-wide studies 11 different lines (14Geiger T. Wehner Schaab Cox Mann Comparative eleven common reveals ubiquitous varying expression proteins.Mol. 2012; (M111.014050)Abstract (579) Scholar), HeLa cells (15Nagaraj N. Wisniewski J.R. Geiger Kircher Kelso Paabo Deep transcriptome mapping line.Mol. Syst. 7: 548Crossref (757) phospho-proteome hES iPS (16Phanstiel D.H. Brumbaugh Wenger C.D. Tian Probasco M.D. Bailey D.J. Swaney D.L. Tervo M.A. Bolin J.M. Ruotti V. Stewart Thomson J.A. Coon J.J. Proteomic phosphoproteomic ES cells.Nat. Methods. 821-827Crossref (217) were downloaded respective repositories (see supplemental Table S1). 
processed essentially Briefly, peak picking processing performed Mascot Distiller 2.4.2.0 (Matrix Science, London, UK) merging precursor isotope fitting below m/z 205 disabled. resulting list files perl script, calculates every spectrum contains least feature within tolerance 10 ppm. searched 2.3.0 against UniProtKB complete (download date 26.10.2010, 110,550 sequences) combined contaminants. case dataset subset database generated Scaffold 3.3.1 (Proteome Software, Portland, OR) only identifications full (11,288 sequences). Carbamidomethylation cysteine residues, oxidation methionine, HexNAc serine, asparagine taken into account variable modifications. Where applicable, tyrosine modification. Likewise, 4-plex 8-plex iTRAQ fixed amino terminus lysine side chain sources tags. According proteases employed original enzyme trypsin (lysine, arginine), LysC (lysine), GluC (aspartic acid, glutamic acid) allowing up two missed cleavage definition detail Fig. S1. target-decoy option enabled ppm 0.02 Da. Search results imported 3.3.1. Proteins required 99% 80% (supplemental S2). Candidate filtered false-positive peptide-spectrum-matches (PSMs) retain PSMs Oscores smaller 2.3. inspected validated manually Spectra). murine compiled recent publications (13Zhao 17Wang Z. Udeshi N.D. Compton P.D. Sakabe Cheung W.D. Shabanowitz Hunt D.F. Extensive crosstalk cytokinesis.Sci. Signal. 3: ra2Crossref (251) 18Chalkley native peptides.Proc. Natl. Acad. Sci. U.S.A. 2009; 106: 8894-8899Crossref (199) 19Myers Panning Polycomb repressive 2 necessary normal site-specific distribution mouse embryonic stem cells.Proc. 108: 9490-9495Crossref (106) databases PhosphositePlus (20Hornbeck P.V. Kornhauser Tkachev Zhang Skrzypek Murray Latham Sullivan PhosphoSitePlus: comprehensive investigating structure function experimentally determined modifications man mouse.Nucleic Acids 40: D261-270Crossref (1141) Information phosphorylated ubiquitinylated retrieved database. 
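The processing step above refers to checking each spectrum for diagnostic features within a 10 ppm tolerance. A minimal sketch of such a check is given here. The list of HexNAc oxonium and fragment ion masses, the function names, and the "at least one match" rule are assumptions made for illustration; they are not taken from the Oscore perl script.

```python
from typing import Iterable, List, Sequence

# Assumed singly charged HexNAc oxonium ion (m/z 204.0867) and commonly
# cited fragments of it. Illustrative only, not the Oscore feature list.
HEXNAC_DIAGNOSTIC_MZ = (204.0867, 186.0761, 168.0655, 144.0655, 138.0550, 126.0550)


def ppm_error(observed: float, theoretical: float) -> float:
    """Signed mass error of an observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def matched_diagnostic_ions(
    peaks_mz: Iterable[float],
    tol_ppm: float = 10.0,
    targets: Sequence[float] = HEXNAC_DIAGNOSTIC_MZ,
) -> List[float]:
    """Return the target m/z values that have a peak within tol_ppm."""
    peaks = list(peaks_mz)
    return [
        target
        for target in targets
        if any(abs(ppm_error(mz, target)) <= tol_ppm for mz in peaks)
    ]


def has_diagnostic_feature(peaks_mz: Iterable[float], tol_ppm: float = 10.0) -> bool:
    """Hypothetical pre-filter: keep spectra with at least one matched ion."""
    return len(matched_diagnostic_ions(peaks_mz, tol_ppm)) >= 1
```

A real scoring scheme would also weigh peak intensities; this sketch only covers the mass-tolerance test.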
Reported N-linked glycosylation extracted UniProtKB, subcellular localization Ingenuity Pathway Analysis (Ingenuity Systems, Redwood City, CA). script www.wzw.tum.de/proteomics/content/research/software/; peaklist ProteomeCommons.org Tranche hash key: ChunHqKHVaLCoocgKoyBjphK1QntOh6ehU0MzuLgwf+FZHjEfAntIyzzY38Rv051iVNoNFNJQHibLYJl4dDRotCm1UAAAAAAAAEpg==(passphrase: sa3sh7mgcf6eolskt57p). means assess represent score increased provided modern spectrometers. therefore reasoned it may identify if so, overall sets public S1), acquired dual pressure linear trap Orbitrap hybrid spectrometers HCD (21Olsen J.V. Schwartz Griep-Raming Nielsen M.L. Damoc Denisov Lange Remes Taylor Splendore Wouters E.R. Senko Makarov Horning instrument sequencing speed.Mol. 2759-2769Abstract (379) first comprises label-free Scholar); second characterization line employing protease digestion third represents iTRAQ-based quantitative four (hES) induced pluripotent (iPS) Together, constitute 13,897,945 spectra. re-analysis, combines standard searching Oscoring assessment potential (Fig. 1A). Both algorithms exploit complementary properties reflects information, solely Given particular behavior alone accurately non-O-GlcNAc 1B). when assigned re-assessed easily Low strong spectra, unlikely no absence features. Oscore-based then desired FDR while maintaining adequate 1C). re-analysis three resulted 158 containing 194 628 (Table I). Manual interpretation best PSM allowed 26 12 below). 13 could narrowed down 140 remained ambiguous. An example depicted Spectra annotated spectra). large facilitate SQSAAVTPSgSTTSSTR ADRM1, support nine has low during conditions render difficult. Clearly, method choice accurate O-GlcNAC ETD, retains enables localization. stretches around actual impede Only five out single (Ser, Thr), average per 5.6. 
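The difficulty of site localization discussed above grows with the number of serine and threonine residues in a peptide. A short sketch for counting these potential acceptor residues follows. It assumes that lowercase letters and parentheses in the annotated sequences (as in SQSAAVTPSgSTTSSTR) are modification markers and that uppercase letters are residues; the function names are hypothetical.

```python
def strip_modifications(annotated: str) -> str:
    """Keep only uppercase residue letters; drop assumed modification marks."""
    return "".join(ch for ch in annotated if ch.isupper())


def count_acceptor_sites(annotated: str) -> int:
    """Count Ser/Thr residues, i.e. potential O-GlcNAc acceptor positions."""
    sequence = strip_modifications(annotated)
    return sum(1 for residue in sequence if residue in "ST")


# Peptide quoted in the text for ADRM1
adrm1_sites = count_acceptor_sites("SQSAAVTPSgSTTSSTR")
```

For the ADRM1 peptide this counts ten serine or threonine residues, so a single HexNAc moiety has many candidate positions when fragmentation does not retain the modification.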
This consistent transferase consensus motifs Interestingly, nonmodified contain 1.5 acceptor phospho-peptides below) harbor 3.3 sites, likely serine/threonine-rich peptides.

Table I. O-GlcNAc studies

Project                          MS/MS        PSM   Peptides   Sites   Proteins
Geiger et al.                    5,985,620    454   104        125     76
Nagaraj et al.                   4,829,525    75    36         38      29
Phanstiel et al.                 1,766,566    99    41         50      32
Total                            12,581,711   628   158a       194a    114a
Phanstiel et al. (phospho set)   1,316,234    107   28         34      22
Total + phospho                  13,897,945   735   174a       204a    124a

a nonredundant

Among localized NX[ST] motif. 20 reliably deduced UniProt (also see S4). generally expected result accordance previous findings (18Chalkley explanation raised artifacts formed lysis cytosolic endo-β-N-acetylglucosaminidase. cleaves β-1,4-glycosidic N,N′-diacetylchitobiose core mannose leaving residue. N-GlcNAc O-glycans, arise in-source glycan region front end spectrometer. After million corresponding 114 candidate Tables S3–S5). re-examined contribute exclusive 3A). highest originates proteomes profiled Within varies significantly S3) reflect cell-type differences O-GlcNAcylation. deep Nagaraj contributed significant though part panel analyzed closer inspection revealed 16 18 originate (7 proteins) digests (nine proteins), underscoring usefulness general particular. Host 1, O-GlcNAcylated. note ten residue (N-GlcNAc). Moreover, although compartments extracellular (22Matsuura Ito Sakaidani Y. Kondo Murakami Furukawa Nadano Matsuda Okajima domain notch receptors.J. 2008; 283: 35486-35495Abstract (137) cannot rule possibility several ER- Golgi-resident early synthesis products O-GalNAc-type glycans. 3B. For 47 reported, 57 report time. Collectively, shows data, makes point favor sharing scientific community. us perform crude estimation frequency From (11 lines), 2,023,960 6124 correspond 454 matched peptides. 
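The NX[ST] motif mentioned above is the consensus sequon for N-linked glycosylation. A small sketch for locating it in a protein or peptide sequence is shown here. The exclusion of proline at the X position is the commonly used definition and is added as an assumption, since the text only writes NX[ST]; the function name is hypothetical.

```python
import re
from typing import List

# N-X-[ST] sequon with X assumed to be any residue except proline.
# The lookahead keeps overlapping sequons (e.g. "NNTS") from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence: str) -> List[int]:
    """Return 0-based positions of asparagines inside an N-X-[ST] sequon."""
    return [match.start() for match in SEQUON.finditer(sequence)]
```

A HexNAc assigned to an asparagine at one of the returned positions would be a candidate N-GlcNAc remnant rather than an O-GlcNAc site.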
Hence, phospho-spectra 1 334 4500 indicating numerically ∼13-fold frequent aware rests assumption O-GlcNAcylated are, large, rate phosphopeptides (although probably approximately true). expressed logarithmic (23Zybailov Mosley Sardiu M.E. Coleman M.K. Florens Washburn M.P. Statistical membrane changes Saccharomyces cerevisiae.J. 2339-2347Crossref (820) (NSAF, 4A). As expected, mostly among medium somewhat unexpectedly, NSAF distributions O-GlcNAc- phospho-proteins quite similar. clearly observed equally abundant, frequent. Alternatively, intensities 4B) proxy (modified) ordinary massively skewed toward many phospho-peptides. hypothesis, estimated summed By method, 0.73 0.90 difference supported fact counterpart 46% phospho-peptides, 26% do realize above estimates efficiencies grossly justified. Still, think appears At time, however, rather abundant) stably physiological conditions. vitro substrates constitutively (24Shen Gloster T.M. Yuzwa Vocadlo Insights dynamics through kinetic O-GlcNAcase substrates.J. 287: 15395-15408Abstract (25Hart Housley Cycling beta-N-acetylglucosamine nucleocytoplasmic proteins.Nature. 2007; 446: 1017-1022Crossref (1081) investigated whether data. Oscore-strategy Overall, 107 28 34 22 I S6–S8). Of 67% moieties. phosphorylated, surprising given 50% notion, cross-talk identical proximal extensive referred being either antagonistic synergistic Most cases literature competitive neighboring argued reciprocal exclusion size (with Stokes radius fivefold larger moiety) negative charge group conformational (26Chen Y.X. Du J.T. Zhou L.X. X.H. Zhao Y.F. Nakanishi Li Y.M. Alternative O-GlcNAcylation/O-phosphorylation Ser16 induce disturbances N estrogen receptor beta.Chem. 13: 937-944Abstract (73) observation 23 median length 24 suggest both distal protein, occupied simultaneously. striking SEApSg(SS)PPVVTSSSHSR SOX2 transcription factor. 
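The crude frequency estimate above can be retraced with a few lines of arithmetic. The sketch assumes that 2,023,960 is the number of spectra considered in the Geiger et al. data, that 6124 of them are phospho-spectra, and that 454 are O-GlcNAc spectra; the surrounding sentence is damaged, so this reading is an assumption, and the variable names are illustrative.

```python
def one_in(total: int, hits: int) -> float:
    """Express a frequency as 'one in N' spectra."""
    return total / hits


# Counts quoted in the text (interpretation assumed, see lead-in)
total_spectra = 2_023_960
phospho_spectra = 6_124
oglcnac_spectra = 454

phospho_one_in = one_in(total_spectra, phospho_spectra)
oglcnac_one_in = one_in(total_spectra, oglcnac_spectra)

# How many times more often a phospho-spectrum is seen than an O-GlcNAc one
fold_difference = phospho_spectra / oglcnac_spectra
```

With these inputs the ratio comes out at roughly 13.5, in line with the approximately 13-fold difference stated in the text. The "one in N" values land near 330 and 4460, close to the quoted 334 and 4500, which may rest on slightly different counts or rounding.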
Here, Spectrum #208) localizes S4 S5 S6, can, (almost) Numerous S9) highlight role histone code regulation (27Hanover Epigenetics gets sweeter: joins "histone code".Chem. 17: 1272-1274Abstract (32) 1Hart identified, H2B particularly interesting close proximity (di-)methylation, ubiquitination, 5). S113 has, recently, monoubiquitination K121. here, moiety seems act primer ubiquitin ligase, presumably transcriptional activation (28Fujiki Hashiba W. Sekine Yokoyama Chikanishi Imai He H.H. Igarashi Kanno Ohtake F. Kitagawa Roeder R.G. Brown Kato facilitates monoubiquitination.Nature. 480: 557-560Crossref (229) precise T53 S65 unknown, might speculate about relationships ubiquitination. Further noteworthy examples include factors SOX-2 Sal-like 4 (SALL4) STAT3, SALL4 (19Myers yet STAT3 (29Whelan Lane Regulation insulin signaling.J. 21411-21417Abstract (124) T714 T721 #193). SALL4, found: S480 T501, T608, S609, S612; additional T608 S628 Spectra: #203, 149, 156, respectively). All identity governing cell-renewal (30Boyer L.A. Lee T.I. Cole Johnstone S.E. Levine S.S. Zucker J.P. Guenther M.G. Kumar R.M. H.L. Jenner Gifford D.K. Melton Jaenisch Young Core regulatory circuitry cells.Cell. 2005; 122: 947-956Abstract (3518) 31Zhang Tam W.L. Tong G.Q. Wu Q. Chan H.Y. Soh B.S. Lou Yang Ma Chai Ng Lufkin Robson Lim Sall4 pluripotency development Pou5f1.Nat. 1114-1123Crossref (449) up-regulating genes down-regulating developmental genes. finding regulate maintain repertoire revisited >13 phosphoproteome hundred so they study laboratories expect vast quantities work those other use indicate infancy. "in passing" were, much highlights need better tools.