NFDI4DS | UHH-SEMS - Publication Details

AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models

OPENALEX - Publications

Mihály Váradi, Stephen Anyango, Mandar Deshpande, Sreenath Nair, Cindy Natassia and 22 more

The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) is an openly accessible, extensive database of high-accuracy protein-structure predictions. Powered by AlphaFold v2.0 of DeepMind, it has enabled an unprecedented expansion of the structural coverage of the known protein-sequence space. AlphaFold DB provides programmatic access to and interactive visualization of predicted atomic coordinates, per-residue and pairwise model-confidence estimates and predicted aligned errors. The initial release of AlphaFold DB contains over 360,000...

10.1093/nar/gkab1061 article EN cc-by Nucleic Acids Research 2021-10-19

Protein Data Bank: the single global archive for 3D macromolecular structure data

OPENALEX - Publications

S.K. Burley, Helen M. Berman, Charmi Bhikadiya, Chunxiao Bi, Li Chen and 77 more

The Protein Data Bank (PDB) is the single global archive of experimentally determined three-dimensional (3D) structure data of biological macromolecules. Since 2003, the PDB has been managed by the Worldwide Protein Data Bank (wwPDB; wwpdb.org), an international consortium that collaboratively oversees deposition, validation, biocuration, and open access dissemination of 3D macromolecular structure data. The PDB Core Archive houses 3D atomic coordinates of more than 144 000 structural models of proteins, DNA/RNA, and their complexes with metals and small...

10.1093/nar/gky949 article EN cc-by Nucleic Acids Research 2018-10-05

PDBe: improved findability of macromolecular structure data in the PDB

OPENALEX - Publications

David Armstrong, John M. Berrisford, M.J. Conroy, Aleksandras Gutmanas, Stephen Anyango and 26 more

The Protein Data Bank in Europe (PDBe), a founding member of the Worldwide Protein Data Bank (wwPDB), actively participates in the deposition, curation, validation, archiving and dissemination of macromolecular structure data. PDBe supports diverse research communities in their use of macromolecular structures by enriching the PDB data and by providing advanced tools and services for effective data access, visualization and analysis. This paper details the enrichment of data at PDBe, including mapping of RNA structures to Rfam, and identification of molecules that act as cofactors. PDBe has...

10.1093/nar/gkz990 article EN cc-by Nucleic Acids Research 2019-10-25

PDBe-KB: a community-driven resource for structural and functional annotations

OPENALEX - Publications

Mihály Váradi, John M. Berrisford, Mandar Deshpande, Sreenath Nair, Aleksandras Gutmanas and 50 more

The Protein Data Bank in Europe-Knowledge Base (PDBe-KB, https://pdbe-kb.org) is a community-driven, collaborative resource for literature-derived, manually curated and computationally predicted structural and functional annotations of macromolecular structure data, contained in the Protein Data Bank (PDB). The goal of PDBe-KB is two-fold: (i) to increase the visibility and reduce the fragmentation of annotations contributed by specialist data resources, and to make these data more findable, accessible, interoperable and reusable (FAIR) and (ii) to place macromolecular structure data in their...

10.1093/nar/gkz853 article EN cc-by Nucleic Acids Research 2019-10-01

PDBe-KB: collaboratively defining the biological context of structural data

OPENALEX - Publications

Mihály Váradi, Stephen Anyango, David Armstrong, John M. Berrisford, Preeti Choudhary and 66 more

The Protein Data Bank in Europe - Knowledge Base (PDBe-KB, https://pdbe-kb.org) is an open collaboration between world-leading specialist data resources contributing functional and biophysical annotations derived from or relevant to the Protein Data Bank (PDB). The goal of PDBe-KB is to place macromolecular structure data in their biological context by developing standardised data exchange formats and integrating functional annotations from the contributing partner resources into a knowledge graph that can provide valuable biological insights. Since we described PDBe-KB in 2019, there have been significant...

10.1093/nar/gkab988 article EN cc-by Nucleic Acids Research 2021-10-15

Identifying protein conformational states in the Protein Data Bank: Toward unlocking the potential of integrative dynamics studies

OPENALEX - Publications

Joseph I. J. Ellaway, Stephen Anyango, Sreenath Nair, Hossam A. Zaki, Nurul Nadzirin and 4 more

Studying protein dynamics and conformational heterogeneity is crucial for understanding biomolecular systems and treating disease. Despite the deposition of over 215 000 macromolecular structures in the Protein Data Bank and the advent of AI-based structure prediction tools such as AlphaFold2, RoseTTAFold, and ESMFold, static representations are typically produced, which fail to fully capture macromolecular motion. Here, we discuss the importance of integrating experimental structures with computational clustering to explore the conformational landscapes that manifest...

10.1063/4.0000251 article EN cc-by Structural Dynamics 2024-05-01

PDBe: towards reusable data delivery infrastructure at protein data bank in Europe

OPENALEX - Publications

Saqib Mir, Younes Alhroub, Stephen Anyango, David Armstrong, John M. Berrisford and 17 more

The Protein Data Bank in Europe (PDBe, pdbe.org) is actively engaged in the deposition, annotation, remediation, enrichment and dissemination of macromolecular structure data. This paper describes new developments and improvements at PDBe addressing three challenging areas: data enrichment, data dissemination and functional reusability. New features of the PDBe Web site are discussed, including a context dependent menu providing links to raw experimental data and improved presentation of structures solved by hybrid methods. The paper also summarizes...

10.1093/nar/gkx1070 article EN cc-by Nucleic Acids Research 2017-10-26

PDBe and PDBe‐KB: Providing high‐quality, up‐to‐date and integrated resources of macromolecular structures to support basic and applied research and education

OPENALEX - Publications

Mihály Váradi, Stephen Anyango, Sri Devan Appasamy, David Armstrong, Marcus Bage and 23 more

The archiving and dissemination of protein and nucleic acid structures as well as their structural, functional and biophysical annotations is an essential task that enables the broader scientific community to conduct impactful research in multiple fields of the life sciences. The Protein Data Bank in Europe (PDBe; pdbe.org) team develops and maintains several databases and web services to address this fundamental need. From data archiving as a member of the Worldwide PDB consortium (wwPDB; wwpdb.org), to the PDBe Knowledge Base (PDBe‐KB;...

10.1002/pro.4439 article EN cc-by Protein Science 2022-09-28

R2DT: a comprehensive platform for visualizing RNA secondary structure

OPENALEX - Publications

Holly M. McCann, Caeden D. Meade, Loren Dean Williams, Anton S. Petrov, Philip Z. Johnson and 17 more

RNA secondary (2D) structure visualization is an essential tool for understanding RNA function. R2DT is a software package designed to visualize RNA 2D structures in consistent, recognizable, and reproducible layouts. The latest release, R2DT 2.0, introduces multiple significant features, including the ability to display position-specific information, such as single nucleotide polymorphisms or SHAPE reactivities. It also offers a new template-free mode allowing visualization of RNAs without pre-existing templates, alongside...

10.1093/nar/gkaf032 article EN cc-by Nucleic Acids Research 2025-01-14

3D-Beacons: decreasing the gap between protein sequences and structures through a federated network of protein structure data resources

OPENALEX - Publications

Mihály Váradi, Sreenath Nair, Ian Sillitoe, Gerardo Tauriello, Stephen Anyango and 27 more

While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modeling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative analysis...

10.1093/gigascience/giac118 article EN GigaScience 2022-01-01

PDBe aggregated API: programmatic access to an integrative knowledge graph of molecular structure data

OPENALEX - Publications

Sreenath Nair, Mihály Váradi, Nurul Nadzirin, Lukáš Pravda, Stephen Anyango and 5 more

The PDBe aggregated API is an open-access and open-source RESTful API that provides programmatic access to a wealth of macromolecular structural data and their functional and biophysical annotations through 80+ API endpoints. The API is powered by the PDBe graph database (https://pdbe.org/graph-schema), an integrative knowledge graph that can be used as a discovery tool to answer complex biological questions. The PDBe aggregated API provides up-to-date access to the PDBe graph database, which has weekly releases with the latest data from the Protein Data Bank, integrated with updated annotations from UniProt, Pfam, CATH, SCOP...

10.1093/bioinformatics/btab424 article EN cc-by Bioinformatics 2021-06-02

Unified access to up-to-date residue-level annotations from UniProtKB and other biological databases for PDB data

OPENALEX - Publications

Preeti Choudhary, Stephen Anyango, John M. Berrisford, James Tolchard, Mihály Váradi and 1 more

More than 61,000 proteins have up-to-date correspondence between their amino acid sequence (UniProtKB) and their 3D structures (PDB), enabled by the Structure Integration with Function, Taxonomy and Sequences (SIFTS) resource. SIFTS incorporates residue-level annotations from many other biological resources. SIFTS data is available in various formats like XML, CSV and TSV format or also accessible via the PDBe REST API but always maintained separately from the structure data (PDBx/mmCIF file) in the PDB archive. Here, we extended the wwPDB...

10.1038/s41597-023-02101-6 article EN cc-by Scientific Data 2023-04-12

PDBe CCDUtils: an RDKit-based toolkit for handling and analysing small molecules in the Protein Data Bank

OPENALEX - Publications

Ibrahim Roshan Kunnakkattu, Preeti Choudhary, Lukáš Pravda, Nurul Nadzirin, Oliver S. Smart and 5 more

While the Protein Data Bank (PDB) contains a wealth of structural information on ligands bound to macromolecules, their analysis can be challenging due to the large amount and diversity of data. Here, we present PDBe CCDUtils, a versatile toolkit for processing and analysing small molecules from the PDB in PDBx/mmCIF format. PDBe CCDUtils provides streamlined access to all the metadata and offers a set of convenient methods to compute various properties using RDKit, such as 2D depictions, 3D conformers, physicochemical properties,...

10.1186/s13321-023-00786-w article EN cc-by Journal of Cheminformatics 2023-12-02

Automated Pipeline for Comparing Protein Conformational States in the PDB to AlphaFold2 Predictions

OPENALEX - Publications

Joseph I. J. Ellaway, Stephen Anyango, Sreenath Nair, Hossam A. Zaki, Nurul Nadzirin and 4 more

Proteins, as molecular machines, are necessarily dynamic macromolecules that carry out essential cellular functions. Recognising their stable conformations is important for understanding the molecular mechanisms of disease. While AI-based computational methods have enabled protein structure prediction, the prediction of protein dynamics remains a challenge. Here, we present a deterministic pipeline that clusters experimentally determined protein structures to comprehensively recognise conformational states across the Protein...

10.1101/2023.07.13.545008 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-07-13

PDBeCIF: an open-source mmCIF/CIF parsing and processing package

OPENALEX - Publications

Glen van Ginkel, Lukáš Pravda, Jose M Dana, Mihály Váradi, Peter M. Keller and 2 more

Background: Biomacromolecular structural data outgrew the legacy Protein Data Bank (PDB) format which the scientific community relied on for decades, yet the use of its successor PDBx/Macromolecular Crystallographic Information File format (PDBx/mmCIF) is still not widespread. Perhaps one of the reasons is the availability of easy to use tools that only support the legacy format, but also the inherent difficulties of processing mmCIF files correctly, given the number of edge cases that make efficient parsing problematic. Nevertheless, to fully exploit...

10.1186/s12859-021-04271-9 article EN cc-by BMC Bioinformatics 2021-07-23

R2DT: a comprehensive platform for visualising RNA secondary structure

OPENALEX - Publications

Holly M. McCann, Caeden D. Meade, Loren Dean Williams, Anton S. Petrov, Philip Z. Johnson and 17 more

RNA secondary (2D) structure visualisation is an essential tool for understanding RNA function. R2DT is a software package designed to visualise RNA 2D structures in consistent, recognisable, and reproducible layouts. The latest release, R2DT 2.0, introduces multiple significant features, including the ability to display position-specific information, such as single nucleotide polymorphisms (SNPs) or SHAPE reactivities. It also offers a new template-free mode allowing visualisation of RNAs without pre-existing...

10.1101/2024.09.29.611006 preprint EN cc-by-nd bioRxiv (Cold Spring Harbor Laboratory) 2024-09-30

Annotating Macromolecular Complexes in the Protein Data Bank: Improving the FAIRness of Structure Data

OPENALEX - Publications

Sri Devan Appasamy, John M. Berrisford, Romana Gáborová, Sreenath Nair, Stephen Anyango and 10 more

Macromolecular complexes are essential functional units in nearly all cellular processes, and their atomic-level understanding is critical for elucidating and modulating molecular mechanisms. The Protein Data Bank (PDB) serves as the global repository for experimentally determined structures of macromolecules. Structural data in the PDB offer valuable insights into the dynamics, conformation, and functional states of biological assemblies. However, the current annotation practices lack standardised naming conventions for assemblies...

10.1038/s41597-023-02778-9 article EN cc-by Scientific Data 2023-12-01

Unified access to up-to-date residue-level annotations from UniProt and other biological databases for PDB data via PDBx/mmCIF files

OPENALEX - Publications

Preeti Choudhary, Stephen Anyango, John M. Berrisford, Mihály Váradi, James Tolchard and 1 more

More than 58,000 proteins have up-to-date correspondence between their amino acid sequence (UniProtKB) and their 3D structures (PDB), enabled by the Structure Integration with Function, Taxonomy and Sequences (SIFTS) resource. In addition to this fundamental mapping, SIFTS incorporates residue-level annotations from other biological resources such as Pfam, InterPro, SCOP, SCOP2, CATH, IntEnz, GO, PubMed, Ensembl, NCBI taxonomy database and Homologene. The SIFTS data is exported in XML format per...

10.1101/2022.08.10.503473 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2022-08-13

Annotating Macromolecular Complexes in the Protein Data Bank: Improving the FAIRness of Structure Data

OPENALEX - Publications

Sri Devan Appasamy, John M. Berrisford, Romana Gáborová, Sreenath Nair, Stephen Anyango and 10 more

Macromolecular complexes are essential functional units in nearly all cellular processes, and their atomic-level understanding is critical for elucidating and modulating molecular mechanisms. The Protein Data Bank (PDB) serves as the global repository for experimentally determined structures of macromolecules. Structural data in the PDB offer valuable insights into the dynamics, conformation, and functional states of biological assemblies. However, the current annotation practices lack standardised naming conventions...

10.1101/2023.05.15.540692 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-05-15

PDBe CCDUtils: an RDKit-based toolkit for handling and analysing small molecules in the Protein Data Bank

OPENALEX - Publications

Ibrahim Roshan Kunnakkattu, Preeti Choudhary, Lukáš Pravda, Nurul Nadzirin, Oliver S. Smart and 5 more

While the Protein Data Bank (PDB) contains a wealth of structural information on ligands bound to macromolecules, their analysis can be challenging due to the large amount and diversity of data. Here, we present PDBe CCDUtils, a versatile toolkit for processing and analysing small molecules from the PDB in PDBx/mmCIF format. PDBe CCDUtils provides streamlined access to all the metadata and offers a set of convenient methods to compute various properties using RDKit, such as 2D depictions, 3D conformers, physicochemical...

10.1101/2023.08.04.552003 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2023-08-07

PDBImages: A Command Line Tool for Automated Macromolecular Structure Visualization

OPENALEX - Publications

Adam Midlik, Sreenath Nair, Stephen Anyango, Mandar Deshpande, David Sehnal and 2 more

Summary: PDBImages is an innovative, open-source Node.js package that harnesses the power of the popular macromolecule structure visualization software Mol*. Designed for use by the scientific community, PDBImages provides a means to generate high-quality images for PDB and AlphaFold DB models. Its unique ability to render and save images directly to files in a browserless mode sets it apart, offering users a streamlined, automated process for macromolecular structure visualization. Here, we detail the implementation of PDBImages, enumerating its diverse...

10.48550/arxiv.2308.00563 preprint EN cc-by arXiv (Cornell University) 2023-01-01

PDBImages: A Command Line Tool for Automated Macromolecular Structure Visualization

OPENALEX - Publications

Adam Midlik, Sreenath Nair, Stephen Anyango, Mandar Deshpande, David Sehnal and 2 more

PDBImages is an innovative, open-source Node.js package that harnesses the power of the popular macromolecule structure visualization software Mol*. Designed for use by the scientific community, PDBImages provides a means to generate high-quality images for PDB and AlphaFold DB models. Its unique ability to render and save images directly to files in a browserless mode sets it apart, offering users a streamlined, automated process for macromolecular structure visualization. Here, we detail the implementation of PDBImages, enumerating its diverse image...

10.1093/bioinformatics/btad744 article EN cc-by Bioinformatics 2023-12-01

3D-Beacons: Decreasing the gap between protein sequences and structures through a federated network of protein structure data resources

OPENALEX - Publications

Mihály Váradi, Sreenath Nair, Ian Sillitoe, Gerardo Tauriello, Stephen Anyango and 27 more

While scientists can often infer the biological function of proteins from their 3-dimensional quaternary structures, the gap between the number of known protein sequences and their experimentally determined structures keeps increasing. A potential solution to this problem is presented by ever more sophisticated computational protein modelling approaches. While often powerful on their own, most methods have strengths and weaknesses. Therefore, it benefits researchers to examine models from various model providers and perform comparative...

10.1101/2022.08.01.501973 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-08-03