- Advanced Proteomics Techniques and Applications
- Mass Spectrometry Techniques and Applications
- Metabolomics and Mass Spectrometry Studies
- Analytical Chemistry and Chromatography
- Glycosylation and Glycoproteins Research
- History and advancements in chemistry
- Monoclonal and Polyclonal Antibodies Research
- Machine Learning in Bioinformatics
- Computational Drug Discovery Methods
- Analytical chemistry methods development
- Molecular Biology Techniques and Applications
- Advanced Chemical Sensor Technologies
- Various Chemistry Research Topics
- Carbohydrate Chemistry and Synthesis
- Biosensors and Analytical Detection
- Protein purification and stability
- Analytical Chemistry and Sensors
- Probiotics and Fermented Foods
- Radioactive element chemistry and processing
- Chemical Thermodynamics and Molecular Structure
- RNA Research and Splicing
- Microbial Natural Products and Biosynthesis
- Infant Nutrition and Health
- Cancer Genomics and Diagnostics
- SARS-CoV-2 and COVID-19 Research
National Institute of Standards and Technology
2015-2025
Material Measurement Laboratory
2012-2025
National Institute of Standards
2009-2025
Lomonosov Moscow State University
2013
This paper documents the design, layout and algorithms of the IUPAC International Chemical Identifier, InChI.
Since its public introduction in 2005, the IUPAC InChI chemical structure identifier standard has become the international, worldwide standard for defined chemical structures. This article will describe the extensive use and dissemination of the InChI and InChIKey structure representations by and for the world-wide chemistry community, the chemical information community, and major publishers and disseminators of chemical and related scientific offerings in manuscripts and databases.
Recent progress in metabolomics and the development of increasingly sensitive analytical techniques have renewed interest in global profiling, i.e., semiquantitative monitoring of all chemical constituents of biological fluids. In this work, we performed global profiling of NIST SRM 1950, "Metabolites in Human Plasma", using GC-MS, LC-MS, and NMR. Metabolome coverage, difficulties, and reproducibility of the experiments on each platform are discussed. A total of 353 metabolites have been identified in this material. GC-MS provides 65 unique...
A description of the methods used to build a high quality, comprehensive reference library of electron-ionization mass spectra is presented. Emphasis is placed on the most challenging part of this project: the improvement of library quality by expert evaluation. The methods employed for this task were developed over the course of a spectrum-by-spectrum review of a library containing well over 100,000 spectra. Although the effectiveness of the methods depended critically on the expertise of the evaluators, a number of guidelines are discussed which were found to be effective in performing this onerous and...
A major unmet need in LC-MS/MS-based proteomics analyses is a set of tools for quantitative assessment of system performance and evaluation of technical variability. Here we describe 46 metrics for monitoring chromatographic performance, electrospray source stability, MS1 and MS2 signals, dynamic sampling of ions for MS/MS, and peptide identification. Applied to data sets from replicate LC-MS/MS analyses, these metrics displayed consistent, reasonable responses to controlled perturbations. The metrics typically displayed variations less than...
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics data sets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and nonreference markers of cancer. The CPTAC laboratories have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra these were 2D liquid...
A mass spectral library search algorithm that identifies compounds that differ from library compounds by a single "inert" structural component is described. This algorithm, the Hybrid Similarity Search, generates a similarity score based on matching both fragment ions and neutral losses. It employs the parameter DeltaMass, defined as the mass difference between query and library compounds, to shift neutral loss peaks in the library spectrum to match the corresponding neutral loss peaks in the query spectrum. When the spectra being compared differ by a single structural feature, these matching neutral loss peaks should contain that feature. This method extends the scope of...
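The shift-and-match idea in this abstract can be illustrated with a minimal sketch. It is not the published Hybrid Similarity Search implementation: the function name `hybrid_score`, the square-root intensity weighting, the greedy matching, the tolerance value, and the assumption of singly charged ions are all illustrative choices.

```python
import math

def hybrid_score(query, library, delta_mass, tol=0.01):
    """Cosine-like score in which each library peak may match a query peak
    either directly or after being shifted by delta_mass.

    query, library: lists of (mz, intensity) tuples, singly charged ions assumed.
    delta_mass: query precursor mass minus library precursor mass.
    """
    used = set()   # indices of query peaks that are already matched
    dot = 0.0
    for lib_mz, lib_int in library:
        best = None
        # Try the direct position first, then the DeltaMass-shifted position.
        for shift in (0.0, delta_mass):
            target = lib_mz + shift
            for i, (q_mz, q_int) in enumerate(query):
                if i in used or abs(q_mz - target) > tol:
                    continue
                if best is None or q_int > best[1]:
                    best = (i, q_int)
        if best is not None:
            used.add(best[0])
            # Product of square-root-weighted intensities.
            dot += math.sqrt(lib_int * best[1])
    norm_q = math.sqrt(sum(inten for _, inten in query))
    norm_l = math.sqrt(sum(inten for _, inten in library))
    return dot / (norm_q * norm_l) if norm_q and norm_l else 0.0


# Hypothetical spectra: the query carries an extra 14 Da group on one fragment.
library_spectrum = [(50.0, 100.0), (100.0, 100.0)]
query_spectrum = [(50.0, 100.0), (114.0, 100.0)]
direct_only = hybrid_score(query_spectrum, library_spectrum, delta_mass=0.0)
hybrid = hybrid_score(query_spectrum, library_spectrum, delta_mass=14.0)
```

With `delta_mass=0.0` the sketch reduces to a plain direct-match score, so comparing the two values shows how much of the similarity comes from shifted peaks.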
We present a mass spectral library-based method to identify tandem mass spectra of peptides that contain unanticipated modifications and amino acid variants. We describe this as a "hybrid" method because it combines matching of both ion m/z and mass losses. The mass loss is the difference between the mass of an ion peak and the mass of its precursor. This difference, termed DeltaMass, is used to shift the product ions in the library spectrum that contain the modification, thereby allowing library product ions that contain the unexpected modification to match the query spectrum. Clustered unidentified spectra from the Clinical Proteomic Tumor...
Metabolomics has a critical need for better tools for mass spectral identification. Common metabolites may be identified by searching libraries of tandem mass spectra, which offers important advantages over other approaches to identification. But libraries are not nearly complete enough to represent the full molecular diversity present in complex biological samples. We present a novel hybrid search method that can help identify metabolites not in the library by their similarity to compounds that are. We call it "hybrid" because it combines conventional, direct peak matching with...
A method to discover and correct errors in mass spectral libraries is described. Comparing entries across a set of highly curated reference libraries for compounds that have the same chemical structure quickly identifies entries that are outliers. In cases where three or more entries for the same compound were compared, the outlier as determined by visual inspection was almost always found to contain the error. These errors were either in the spectrum itself or in the chemical descriptors that accompanied it. The method is demonstrated on finding errors in compounds of forensic interest in the NIST/EPA/NIH Mass Spectral...
Spectral library searching (SLS) is an attractive alternative to sequence database searching (SDS) for peptide identification due to its speed, sensitivity, and ability to include any selected mass spectra. While decoy methods for SLS have been developed for low mass accuracy spectral libraries, it is not clear that they are optimal or directly applicable to high mass accuracy spectral libraries. Therefore, we report the development and validation of decoy methods for high mass accuracy spectral libraries. Two types of decoy libraries were found to be suitable for this purpose. The first, referred to as Reverse, constructs...
While gas chromatography mass spectrometry (GC-MS) has long been used to identify compounds in complex mixtures, this process is often subjective and time-consuming and leaves a large fraction of seemingly good-quality spectra unidentified. In this work, we describe a set of new mass spectral library-based methods to assist in compound identification in complex mixtures. These methods employ the uniqueness and ubiquity of library entries alongside noise reduction and automated comparison of retention indices of compounds. As a test data set, publicly...
InChIKey is a 27-character compacted (hashed) version of InChI which is intended for Internet and database searching/indexing and is based on an SHA-256 hash of the InChI character string. The first block of InChIKey encodes the molecular skeleton while the second block represents various kinds of isomerism (stereo, tautomeric, etc.). InChIKey is designed to be a nearly unique substitute for the parent InChI. However, a single InChIKey may occasionally map to two or more InChI strings (collision). The appearance of collision itself does not compromise the signature as collision-free hashing...
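The block layout described in this abstract can be illustrated with a small parser. This is a sketch of the key's layout only and does not compute the hash. The 14-character and 10-character block lengths and the single trailing character are taken from the standard InChIKey format rather than from the abstract, and the helper names `split_inchikey` and `same_skeleton` are hypothetical.

```python
import re

# Layout assumed here: 14 letters, hyphen, 10 letters, hyphen, 1 letter = 27 characters.
_INCHIKEY_RE = re.compile(r"^([A-Z]{14})-([A-Z]{10})-([A-Z])$")

def split_inchikey(key):
    """Split an InChIKey into its blocks, raising ValueError if the layout is wrong."""
    match = _INCHIKEY_RE.match(key)
    if match is None:
        raise ValueError(f"not a well-formed InChIKey: {key!r}")
    skeleton, isomerism, trailing = match.groups()
    return {
        "skeleton": skeleton,    # first block: hash of the molecular skeleton
        "isomerism": isomerism,  # second block: stereo, tautomeric, etc. plus flag characters
        "trailing": trailing,    # final single-character block
    }

def same_skeleton(key_a, key_b):
    """True when two keys share the first block, i.e. the same hashed skeleton."""
    return split_inchikey(key_a)["skeleton"] == split_inchikey(key_b)["skeleton"]


aspirin_key = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
blocks = split_inchikey(aspirin_key)
```

Comparing only the first block is a convenient way to group entries that share a skeleton, keeping in mind that, as the abstract notes, a hashed key can occasionally collide.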
We describe the creation of a mass spectral library composed of all identifiable spectra derived from the tryptic digest of the NISTmAb IgG1κ. The library is a unique reference spectral collection developed from over six million peptide-spectrum matches acquired by liquid chromatography-mass spectrometry (LC-MS) over a wide range of collision energy. Conventional one-dimensional (1D) LC-MS was used for various digestion conditions and 20- and 24-fraction two-dimensional (2D) LC-MS studies permitted in-depth analyses of single digests. Computer methods...
Recent reports have demonstrated that genetically variant peptides (GVPs) derived from human hair shaft proteins can be used to differentiate individuals of different biogeographic origins. We report a method involving direct extraction that is more sensitive than previously published methods regarding GVP detection. It involves one step for protein extraction and was found to provide reproducible results. A detailed proteomic analysis of this data is presented that led to the following four results: (i) peptide spectral...
We report the comparison of mass-spectral-based abundances of tryptic glycopeptides to fluorescence abundances of released labeled glycans and the effects of mass and charge state and in-source fragmentation on glycopeptide abundances. The primary glycoforms derived from Rituximab, NISTmAb, Evolocumab, and Infliximab were high-mannose and biantennary complex galactosylated and fucosylated N-glycans. Except for ions loss HexNAc or HexNAc-Hex sugars are prominent other therapeutic IgGs. After excluding results, a linear correlation was...
Derivatization of peptides with isobaric tags such as iTRAQ and TMT is widely employed in proteomics due to their compatibility with multiplex quantitative measurements. We recently made publicly available a large peptide library derived from iTRAQ 4-plex labeled spectra. This resource has not been used for identifying peptides labeled with related tags of different masses, because values for virtually all masses of precursor and most product ions would differ for ions containing the tag, as well as for tag-specific peaks. We describe a method for interconverting spectra (6-...
Glycopeptide Abundance Distribution Spectra (GADS) were recently introduced as a means of representing, storing, and comparing glycan profiles of intact glycopeptides. Here, using that representation, an extensive analysis is made of multiple commercial sources of the recombinant SARS-CoV-2 spike protein, each containing 22 N-linked glycan sites (sequons). Multiple proteases are used along with variable energy fragmentation followed by ion trap confirmation. This enables a detailed examination of the reproducibility...
A method for representing and comparing distributions of N-linked glycans located at specific sites on proteins is presented. The representation takes the form of a simple mass spectrum for a given peptide sequence, with each peak corresponding to a different glycopeptide. The mass (in place of m/z) of each peak is that of the glycan mass, and its abundance corresponds to its relative abundance in the electrospray MS1 spectrum. This provides a facile means of representing all identifiable glycopeptides arising from a single protein "sequon" thereby enabling the comparison and searching...
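The representation described in this abstract, one peak per glycoform at the glycan mass with MS1 relative abundance as peak height, might be sketched as follows. The function names, the scaling of the largest peak to 100, the use of summed residue masses as the glycan mass, and the cosine comparison are assumptions made for illustration and are not details taken from the paper.

```python
import math

# Monoisotopic residue masses (Da) of common monosaccharides.
RESIDUE_MASS = {"HexNAc": 203.0794, "Hex": 162.0528, "Fuc": 146.0579, "NeuAc": 291.0954}

def glycan_mass(composition):
    """Sum of residue masses for a composition such as {"HexNAc": 2, "Hex": 5}."""
    return sum(RESIDUE_MASS[residue] * count for residue, count in composition.items())

def build_gads(glycoforms):
    """Turn (composition, MS1 abundance) pairs for one sequon into a peak list.

    Returns (glycan mass, relative abundance) tuples sorted by mass,
    with the most abundant glycoform scaled to 100.
    """
    top = max(abundance for _, abundance in glycoforms)
    peaks = [(round(glycan_mass(comp), 4), 100.0 * abundance / top)
             for comp, abundance in glycoforms]
    return sorted(peaks)

def gads_similarity(a, b, tol=0.01):
    """Cosine similarity between two peak lists, matching peaks by glycan mass."""
    dot = 0.0
    for mass_a, abund_a in a:
        for mass_b, abund_b in b:
            if abs(mass_a - mass_b) <= tol:
                dot += abund_a * abund_b
                break
    norm_a = math.sqrt(sum(x * x for _, x in a))
    norm_b = math.sqrt(sum(x * x for _, x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical abundances for two glycoforms observed at one site.
site_peaks = build_gads([
    ({"HexNAc": 2, "Hex": 5}, 4.0e6),            # high-mannose Man5
    ({"HexNAc": 4, "Hex": 3, "Fuc": 1}, 8.0e6),  # core-fucosylated biantennary
])
self_similarity = gads_similarity(site_peaks, site_peaks)
```

Because every site reduces to an ordinary peak list, the same comparison could be applied across proteins, lots, or suppliers without glycan-specific logic.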
Annotating product ion peaks in tandem mass spectra is essential for evaluating spectral quality and validating peptide identification. This task is more complex for glycopeptides and is crucial for the confident determination of glycosylation sites in glycoproteins. MS_Piano (Mass Spectrum Peptide Annotation) software was developed for reliable annotation of peaks in collision induced dissociation (CID) tandem mass spectra of peptides or N-glycopeptides for given peptide sequences, charge states, and optional modifications. The program annotates each peak high low...
We present a mass spectral library-based method for analyzing site-specific N-linked protein glycosylation. Its operation and utility are illustrated by applying it to both newly measured and available proteomics data of human milk glycoproteins. It generates two varieties of mass spectral libraries. One contains glycopeptide abundance distribution spectra (GADS). The other contains tandem mass spectra of the underlying glycopeptides. Both originate from identified glycopeptides in proteolytic digests of purified glycoproteins, which...
This work presents a detailed determination of site-specific N-glycan distributions of the recombinant influenza glycoproteins hemagglutinin and neuraminidase. Variation in glycosylation among glycoproteins is not predictable and can depend on details of the biomanufacturing process as well as protein structure. In this study, proteins were analyzed from eight strains from four different suppliers. These include five hemagglutinin and three neuraminidase proteins, each produced in a HEK293 cell line. Digestion was conducted using a series of complex...