Rabie Saidi

ORCID: 0000-0002-0449-5253
Research Areas
  • Machine Learning in Bioinformatics
  • Bioinformatics and Genomic Networks
  • Genomics and Phylogenetic Studies
  • Biomedical Text Mining and Ontologies
  • Data Mining Algorithms and Applications
  • Gene expression and cancer classification
  • Protein Structure and Dynamics
  • Scientific Computing and Data Management
  • Enzyme Structure and Function
  • Advanced Proteomics Techniques and Applications
  • Genetics, Bioinformatics, and Biomedical Research
  • Rough Sets and Fuzzy Logic
  • Computational Drug Discovery Methods
  • Molecular Biology Techniques and Applications
  • Research Data Management Practices
  • Data Management and Algorithms
  • Microbial Natural Products and Biosynthesis
  • Metabolomics and Mass Spectrometry Studies
  • Natural Language Processing Techniques
  • Semantic Web and Ontologies
  • RNA and protein synthesis mechanisms
  • Microbial Metabolic Engineering and Bioproduction
  • Geological and Geochemical Analysis
  • Geological and Geophysical Studies Worldwide
  • Graph Theory and Algorithms

European Bioinformatics Institute
2014-2024

SIB Swiss Institute of Bioinformatics
2024

Universitat Politècnica de Catalunya
2020

Institut Supérieur d’Informatique, de Modélisation et de leurs Applications
2014

University of Sfax
2014

Centre National de la Recherche Scientifique
2007-2012

Clermont Université
2012

Université Clermont Auvergne
2012

Laboratoire d'Informatique, de Modélisation et d'Optimisation des Systèmes
2011-2012

Institut Pascal
2010

Alex Bateman María Martin Sandra Orchard Michele Magrane Rahat Agivetova and 95 more Shadab Ahmad Emanuele Alpi Emily Bowler-Barnett Ramona Britto Borisas Bursteinas Hema Bye‐A‐Jee Ray Coetzee Austra Cukura Alan Da Silva Paul Denny Tunca Doğan ThankGod E. Ebenezer Jun Fan Leyla Jael Castro Penelope Garmiri George P. Georghiou Leonardo Jose da Costa Gonzales Emma Hatton-Ellis Abdulrahman Hussein Alexandr Ignatchenko Giuseppe Insana Rizwan Ishtiaq Petteri Jokinen Vishal Joshi Dushyanth Jyothi Antonia Lock Rodrigo López Aurélien Luciani Jie Luo Yvonne Lussi Alistair MacDougall Fábio Madeira Mahdi Mahmoudy M. Menchi Alok Mishra Katie Moulang Andrew Nightingale Carla Susana Oliveira Sangya Pundir Guoying Qi Shriya Raj Daniel L Rice M. Rodríguez-López Rabie Saidi J. H. Sampson Tony Sawford Elena Speretta E. B. Turner Nidhi Tyagi Preethi Vasudev Vladimir Volynkin Kate Warner Xavier Watkins Rossana Zaru Hermann Zellner Alan Bridge Sylvain Poux Nicole Redaschi Lucila Aimo Ghislaine Argoud‐Puy Andrea Auchincloss Kristian B. Axelsen Parit Bansal Delphine Baratin Marie-Claude Blatter Jerven Bolleman Emmanuel Boutet Lionel Breuza Cristina Casals‐Casas Leyla Jael Castro Kamal Chikh Echioukh Elisabeth Coudert Beatrice Cuche Mikael Doche Dolnide Dornevil Anne Estreicher Maria Livia Famiglietti Marc Feuermann Elisabeth Gasteiger Sébastien Géhant Vivienne Baillie Gerritsen Arnaud Gos Nadine Gruaz-Gumowski Ursula Hinz Chantal Hulo Nevila Hyka‐Nouspikel Florence Jungo G. Keller Arnaud Kerhornou V. Lara Philippe Le Mercier Damien Lieberherr Thierry Lombardot Xavier Martín Patrick Masson

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this article, we describe significant updates that we have made over the last two years to the resource. The number of sequences in UniProtKB has risen to approximately 190 million, despite continued work to reduce sequence redundancy at the proteome level. We have adopted new methods of assessing proteome completeness and quality. We continue to extract detailed annotations from...

10.1093/nar/gkaa1100 article EN cc-by Nucleic Acids Research 2020-11-02
Alex Bateman María Martin Sandra Orchard Michele Magrane Shadab Ahmad and 95 more Emanuele Alpi Emily Bowler-Barnett Ramona Britto Hema Bye‐A‐Jee Austra Cukura Paul Denny Tunca Doğan ThankGod E. Ebenezer Jun Fan Penelope Garmiri Leonardo Jose da Costa Gonzales Emma Hatton-Ellis Abdulrahman Hussein Alexandr Ignatchenko Giuseppe Insana Rizwan Ishtiaq Vishal Joshi Dushyanth Jyothi Swaathi Kandasaamy Antonia Lock Aurélien Luciani Marija Lugaric Jie Luo Yvonne Lussi Alistair MacDougall Fábio Madeira Mahdi Mahmoudy Alok Mishra Katie Moulang Andrew Nightingale Sangya Pundir Guoying Qi Shriya Raj Pedro Raposo Daniel L Rice Rabie Saidi Rafael Santos Elena Speretta James Stephenson Prabhat Totoo E. B. Turner Nidhi Tyagi Preethi Vasudev Kate Warner Xavier Watkins Rossana Zaru Hermann Zellner Alan Bridge Lucila Aimo Ghislaine Argoud‐Puy Andrea Auchincloss Kristian B. Axelsen Parit Bansal Delphine Baratin Teresa M Batista Neto Marie-Claude Blatter Jerven Bolleman Emmanuel Boutet Lionel Breuza Blanca Cabrera Gil Cristina Casals‐Casas Kamal Chikh Echioukh Elisabeth Coudert Beatrice Cuche Edouard de Castro Anne Estreicher Maria Livia Famiglietti Marc Feuermann Elisabeth Gasteiger Pascale Gaudet Sébastien Géhant Vivienne Baillie Gerritsen Arnaud Gos Nadine Gruaz Chantal Hulo Nevila Hyka‐Nouspikel Florence Jungo Arnaud Kerhornou Philippe Le Mercier Damien Lieberherr Patrick Masson Anne Morgat Venkatesh Muthukrishnan Salvo Paesano Ivo Pedruzzi Sandrine Pilbout Lucille Pourcel Sylvain Poux Monica Pozzato Manuela Pruess Nicole Redaschi Catherine Rivoire Christian Sigrist Karin Sonesson Shyamala Sundaram

The aim of the UniProt Knowledgebase is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this publication we describe enhancements made to our data processing pipeline and website to adapt to an ever-increasing information content. The number of sequences in UniProtKB has risen to over 227 million and we are working towards including a reference proteome for each taxonomic group. We continue to extract detailed annotations from the literature...

10.1093/nar/gkac1052 article EN cc-by Nucleic Acids Research 2022-11-21
Naihui Zhou Yuxiang Jiang Timothy Bergquist Alexandra Lee Balint Z. Kacsoh and 95 more Alex W. Crocker Kimberley A. Lewis George P. Georghiou Huy Nguyen Md-Nafiz Hamid L. Taylor Davis Tunca Doğan Volkan Atalay Ahmet Süreyya Rifaioğlu Alperen Dalkıran Rengül Çetin-Atalay Chengxin Zhang Rebecca L. Hurto Peter L. Freddolino Yang Zhang Prajwal Bhat Fran Supek José M. Fernández Branislava Gemović Vladimir Perović Radoslav Davidović Neven Šumonja Nevena Veljković Ehsaneddin Asgari Mohammad R. K. Mofrad Giuseppe Profiti Castrense Savojardo Pier Luigi Martelli Rita Casadio Florian Boecker Heiko Schoof Indika Kahanda Natalie Thurlby Alice C. McHardy Alexandre Renaux Rabie Saidi Julian Gough Alex A. Freitas Magdalena Antczak Fábio Fabris Mark N. Wass Jie Hou Jianlin Cheng Zheng Wang Alfonso E. Romero Alberto Paccanaro Haixuan Yang Tatyana Goldberg Chenguang Zhao Liisa Holm Petri Törönen Alan Medlar Elaine Zosa Itamar Borukhov Ilya B. Novikov Angela D. Wilkins Olivier Lichtarge Po-Han Chi Wei-Cheng Tseng Michal Linial Peter W. Rose Christophe Dessimoz Vedrana Vidulin Sašo Džeroski Ian Sillitoe Sayoni Das Jonathan Lees David T. Jones Cen Wan Domenico Cozzetto Rui Fa Mateo Torres Alex Warwick Vesztrocy José Manuel Rodrı́guez Michael L. Tress Marco Frasca Marco Notaro Giuliano Grossi Alessandro Petrini Matteo Ré Giorgio Valentini Marco Mesiti Daniel B. Roche Jonas Reeb David W. Ritchie Sabeur Aridhi Seyed Ziaeddin Alborzi Marie‐Dominique Devignes Da Chen Emily Koo Richard Bonneau Vladimir Gligorijević Meet Barot Hai Fang Stefano Toppo Enrico Lavezzo

Background: The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Results: Here, we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of the volume of data analyzed and the types of analysis performed. In a novel and major new development, predictions and assessment goals drove some of the experimental assays, resulting in functional annotations for...

10.1186/s13059-019-1835-8 article EN cc-by Genome Biology 2019-11-19
Elisabeth Coudert Sébastien Géhant Edouard de Castro Monica Pozzato Delphine Baratin and 95 more Teresa Batista Neto Christian Sigrist Nicole Redaschi Alan Bridge Alan Bridge Lucila Aimo Ghislaine Argoud‐Puy Andrea Auchincloss Kristian B. Axelsen Parit Bansal Delphine Baratin Teresa M Batista Neto Marie-Claude Blatter Jerven Bolleman Emmanuel Boutet Lionel Breuza Blanca Cabrera Gil Cristina Casals‐Casas Kamal Chikh Echioukh Elisabeth Coudert Beatrice Cuche Edouard de Castro Anne Estreicher Maria Livia Famiglietti Marc Feuermann Elisabeth Gasteiger Pascale Gaudet Sébastien Géhant Vivienne Baillie Gerritsen Arnaud Gos Nadine Gruaz Chantal Hulo Nevila Hyka‐Nouspikel Florence Jungo Arnaud Kerhornou Philippe Le Mercier Damien Lieberherr Patrick Masson Anne Morgat Venkatesh Muthukrishnan Salvo Paesano Ivo Pedruzzi Sandrine Pilbout Lucille Pourcel Sylvain Poux Monica Pozzato Manuela Pruess Nicole Redaschi Catherine Rivoire Christian Sigrist Karin Sonesson Shyamala Sundaram Alex Bateman María Martin Sandra Orchard Michele Magrane Shadab Ahmad Emanuele Alpi Emily Bowler-Barnett Ramona Britto Hema Bye- A-Jee Austra Cukura Paul Denny Tunca Doğan ThankGod E. Ebenezer Jun Fan Penelope Garmiri Leonardo Jose da Costa Gonzales Emma Hatton-Ellis Abdulrahman Hussein Alexandr Ignatchenko Giuseppe Insana Rizwan Ishtiaq Vishal Joshi Dushyanth Jyothi Swaathi Kandasaamy Antonia Lock Aurélien Luciani Marija Lugaric Jie Luo Yvonne Lussi Alistair MacDougall Fábio Madeira Mahdi Mahmoudy Alok Mishra Katie Moulang Andrew Nightingale Sangya Pundir Guoying Qi Shriya Raj Pedro Raposo Daniel L Rice Rabie Saidi Rafael Santos Elena Speretta

Motivation: To provide high quality, computationally tractable annotation of binding sites for biologically relevant (cognate) ligands in UniProtKB using the chemical ontology ChEBI (Chemical Entities of Biological Interest), to better support efforts to study and predict functionally relevant interactions between protein sequences and structures and small molecule ligands. Results: We structured the data model for cognate ligand binding site annotations in UniProtKB and performed a complete reannotation of all cognate ligand binding sites using stable unique identifiers from...

10.1093/bioinformatics/btac793 article EN cc-by Bioinformatics 2022-12-08

Life sciences are yielding huge data sets that underpin scientific discoveries fundamental to improvement in human health, agriculture and the environment. In support of these discoveries, a plethora of databases and tools are deployed, in technically complex and diverse implementations, across a spectrum of scientific disciplines. The corpus of documentation of these resources is fragmented across the Web, with much redundancy, and has lacked a common standard of information. The outcome is that scientists must often struggle to find, understand, compare and use the best resources for...

10.1093/nar/gkv1116 article EN cc-by Nucleic Acids Research 2015-11-03
Alistair MacDougall Vladimir Volynkin Rabie Saidi Diego Poggioli Hermann Zellner and 95 more Emma Hatton-Ellis Vishal Joshi Claire O’Donovan Sandra Orchard Andrea Auchincloss Delphine Baratin Jerven Bolleman Elisabeth Coudert Leyla Jael Castro Chantal Hulo Patrick Masson Ivo Pedruzzi Catherine Rivoire Cecilia Arighi Qinghua Wang Chuming Chen Hongzhan Huang John S. Garavelli C R Vinayaka Lai-Su Yeh Darren A. Natale Kati Laiho María Martin Alexandre Renaux Klemens Pichler Alex Bateman Alan Bridge Cathy Wu Cecilia Arighi Lionel Breuza Elisabeth Coudert Hongzhan Huang Damien Lieberherr Michele Magrane María Martin Peter B. McGarvey Darren A. Natale Sandra Orchard Ivo Pedruzzi Sylvain Poux Manuela Pruess Shriya Raj Nicole Redaschi Lucila Aimo Ghislaine Argoud‐Puy Andrea Auchincloss Kristian B. Axelsen Emmanuel Boutet Emily Bowler-Barnett Ramona Britto Hema Bye‐A‐Jee Cristina Casals‐Casas Paul Denny Anne Estreicher Maria Livia Famiglietti Marc Feuermann John S. Garavelli Penelope Garmiri Arnaud Gos Nadine Gruaz Emma Hatton-Ellis Chantal Hulo Nevila Hyka‐Nouspikel Florence Jungo Kati Laiho Philippe Le Mercier Antonia Lock Yvonne Lussi Alistair MacDougall Patrick Masson Anne Morgat Sandrine Pilbout Lucille Pourcel Catherine Rivoire Karen Ross Christian Sigrist Elena Speretta Shyamala Sundaram Nidhi Tyagi C R Vinayaka Qinghua Wang Kate Warner Lai-Su Yeh Rossana Zaru Shadab Ahmed Emanuele Alpi Leslie Arminski Parit Bansal Delphine Baratin Teresa Batista Neto Jerven Bolleman Chuming Chen Chuming Chen Beatrice Cuche Austra Cukura

Motivation: The number of protein records in the UniProt Knowledgebase (UniProtKB: https://www.uniprot.org) continues to grow rapidly as a result of genome sequencing and the prediction of protein-coding genes. Providing functional annotation for these proteins presents a significant and continuing challenge. Results: In response to this challenge, UniProt has developed a method of annotation, known as UniRule, based on expertly curated rules, which integrates related systems (RuleBase, HAMAP, PIRSR, PIRNR) developed by members...

10.1093/bioinformatics/btaa485 article EN cc-by Bioinformatics 2020-05-05

Motivation: Similarity-based methods have been widely used in order to infer the properties of genes and gene products containing little or no experimental annotation. New approaches that overcome the limitations of methods that rely solely upon sequence similarity are attracting increased attention. One of these novel approaches is to use the organization of the structural domains in proteins. Results: We propose a method for the automatic annotation of protein sequences in the UniProt Knowledgebase (UniProtKB) by comparing their domain...

10.1093/bioinformatics/btw114 article EN cc-by Bioinformatics 2016-03-07

Background: This paper deals with the preprocessing of protein sequences for supervised classification. Motif extraction is one way to address that task. It has been largely used to encode biological sequences into feature vectors to enable using well-known machine-learning classifiers which require this format. However, designing a suitable feature space, for a set of proteins, is not a trivial task. For this purpose, we propose a novel encoding method that uses amino-acid substitution matrices to define similarity between motifs during...

10.1186/1471-2105-11-175 article EN cc-by BMC Bioinformatics 2010-04-08
Naihui Zhou Yuxiang Jiang Timothy Bergquist Alexandra Lee Balint Z. Kacsoh and 95 more Alex W. Crocker Kimberley A. Lewis George P. Georghiou Huy Nguyen Md-Nafiz Hamid L. Taylor Davis Tunca Doğan Volkan Atalay Ahmet Süreyya Rifaioğlu Alperen Dalkıran Rengül Çetin-Atalay Chengxin Zhang Rebecca L. Hurto Peter L. Freddolino Yang Zhang Prajwal Bhat Fran Supek José M. Fernández Branislava Gemović Vladimir Perović Radoslav Davidović Neven Šumonja Nevena Veljković Ehsaneddin Asgari Mohammad RK Mofrad Giuseppe Profiti Castrense Savojardo Pier Luigi Martelli Rita Casadio Florian Boecker Indika Kahanda Natalie Thurlby Alice C. McHardy Alexandre Renaux Rabie Saidi Julian Gough Alex A. Freitas Magdalena Antczak Fábio Fabris Mark N. Wass Jie Hou Jianlin Cheng Jie Hou Zheng Wang Alfonso E. Romero Alberto Paccanaro Haixuan Yang Tatyana Goldberg Chenguang Zhao Liisa Holm Petri Törönen Alan Medlar Elaine Zosa Itamar Borukhov Ilya B. Novikov Angela D. Wilkins Olivier Lichtarge Po-Han Chi Wei-Cheng Tseng Michal Linial Peter W. Rose Christophe Dessimoz Vedrana Vidulin Sašo Džeroski Ian Sillitoe Sayoni Das Jonathan Lees David T. Jones Cen Wan Domenico Cozzetto Rui Fa Mateo Torres Alex Wiarwick Vesztrocy José Manuel Rodrı́guez Michael L. Tress Marco Frasca Marco Notaro Giuliano Grossi Alessandro Petrini Matteo Ré Giorgio Valentini Marco Mesiti Daniel B. Roche Jonas Reeb David W. Ritchie Sabeur Aridhi Seyed Ziaeddin Alborzi Marie‐Dominique Devignes Da Chen Emily Koo Richard Bonneau Vladimir Gligorijević Meet Barot Hai Fang Stefano Toppo Enrico Lavezzo

The Critical Assessment of Functional Annotation (CAFA) is an ongoing, global, community-driven effort to evaluate and improve the computational annotation of protein function. Here we report on the results of the third CAFA challenge, CAFA3, that featured an expanded analysis over the previous CAFA rounds, both in terms of the volume of data analyzed and the types of analysis performed. In a novel and major new development, predictions and assessment goals drove some of the experimental assays, resulting in functional annotations for more than 1000...

10.1101/653105 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2019-05-29

The use of raw amino acid sequences as input for deep learning models for protein functional prediction has gained popularity in recent years. This scheme obliges to manage proteins with different lengths, while deep learning models require same-shape input. To accomplish this, zeros are usually added to each sequence up to an established common length in a process called zero-padding. However, the effect of padding strategies on model performance and data structure is yet unknown. We propose and implement four novel types...

10.1038/s41598-020-71450-8 article EN cc-by Scientific Reports 2020-09-03

Systemic analysis of available large-scale biological/biomedical data is critical for studying biological mechanisms, and developing novel and effective treatment approaches against diseases. However, different layers of the available data are produced using different technologies and scattered across individual computational resources without any explicit connections to each other, which hinders extensive and integrative multi-omics-based analysis. We aimed to address this issue by developing a new integration/representation...

10.1093/nar/gkab543 article EN cc-by Nucleic Acids Research 2021-06-11

The increasing growth of databases raises an urgent need for more accurate methods to better understand the stored data. In this scope, association rules were extensively used for the analysis and comprehension of huge amounts of data. However, the number of generated rules is too large to be efficiently analyzed and explored in any further process. In order to bypass this hamper, an efficient selection of rules has to be performed. Since selection is necessarily based on evaluation, many interestingness measures have been proposed. However, the abundance of these measures gave rise to a new...

10.1142/s0218213014600112 article EN International Journal of Artificial Intelligence Tools 2014-08-01

The huge number of association rules represent the main hamper that a decision maker faces. In order to bypass this hamper, an efficient selection of rules has to be performed. Since selection is necessarily based on evaluation, many interestingness measures have been proposed. However, the abundance of these measures gave rise to a new problem, namely the heterogeneity of the evaluation results, and created confusion in the decision. In this respect, we propose a novel approach to discover interesting association rules without favoring or excluding any measure by adopting the notion...

10.1109/ictai.2012.94 preprint EN 2012-11-01

Experimental testing and manual curation are the most precise ways for assigning Gene Ontology (GO) terms describing protein functions. However, they are expensive, time-consuming and cannot cope with the exponential growth of data generated by high-throughput sequencing methods. Hence, researchers need reliable computational systems to help fill the gap with automatic function prediction. The results of the last Critical Assessment of Function Annotation challenge revealed that GO-terms prediction remains a very...

10.1093/bioinformatics/btac536 article EN cc-by Bioinformatics 2022-08-05

The cherty rocks of the Chouabine Formation of the Gafsa-Metlaoui basin (south-western Tunisia), that is composed by biogenic silica, are treated using thermal treatment at 1000°C with flux calcination method in order to prepare a specific filter aids for melting sulfur used for the production of sulfuric acid. This work presents the effect of heating on the granulometry of chert. The mineralogical composition of the natural starting chert is opal CT (cristobalite/tridymite) and a mineral mixture of quartz, smectite clay minerals,...

10.1088/1757-899x/28/1/012027 article EN IOP Conference Series Materials Science and Engineering 2012-02-07

One of the most powerful techniques to study proteins is to look for recurrent fragments (also called substructures), then use them as patterns to characterize the proteins under study. Although protein sequences have been extensively studied in the literature, studying three-dimensional (3D) structures can reveal relevant structural and functional information that may not be derived from sequences alone. An emergent trend consists in parsing 3D structures into graphs of amino acids. Hence, the search for substructures is formulated as a process of frequent...

10.1089/cmb.2013.0092 article EN Journal of Computational Biology 2013-10-13

Recent advances in computing power and machine learning empower functional annotation of protein sequences and their transcript variations. Here, we present an automated prediction system UniGOPred, for GO annotations and a database of GO term predictions for proteomes of several organisms in UniProt Knowledgebase (UniProtKB). UniGOPred provides function predictions for 514 molecular function (MF), 2909 biological process (BP), and 438 cellular component (CC) GO terms for each protein sequence. UniGOPred covers nearly the whole functionality spectrum in Gene...

10.1002/prot.25416 article EN Proteins Structure Function and Bioinformatics 2017-11-03

The widening gap between known proteins and their functions has encouraged the development of methods to automatically infer annotations. Automatic functional annotation is expected to meet the conflicting requirements of maximizing coverage, while minimizing erroneous assignments. This trade-off imposes a great challenge in designing intelligent systems to tackle the problem of automatic protein annotation. In this work, we present a system that utilizes rule mining techniques to predict metabolic pathways...

10.1371/journal.pone.0158896 article EN cc-by PLoS ONE 2016-07-08

Feature extraction is an unavoidable task, especially in the critical step of preprocessing biological sequences. This step consists for example in transforming the biological sequences into vectors of motifs where each motif is a subsequence that can be seen as a property (or attribute) characterizing the sequence. Hence, we obtain an object-property table where objects are sequences and properties are motifs extracted from sequences. This output can be used to apply standard machine learning tools to perform data mining tasks such as classification. Several previous works have...

10.1145/2382936.2383060 preprint EN 2012-10-07
Leyla García Jerven Bolleman Sébastien Géhant Nicole Redaschi María Martin and 95 more Alex Bateman Michele Magrane María Martin Sandra Orchard Shriya Raj Shadab Ahmad Emanuele Alpi Emily Bowler-Barnett Ramona Britto Borisas Bursteinas Hema Bye‐A‐Jee Tunca Doğan Leyla García Penelope Garmiri George P. Georghiou Leonardo Jose da Costa Gonzales Emma Hatton-Ellis Alexandr Ignatchenko Giuseppe Insana Rizwan Ishtiaq Vishal Joshi Dushyanth Jyothi Jie Luo Yvonne Lussi Alistair MacDougall Mahdi Mahmoudy Andrew Nightingale Carla Oliveira Joseph Onwubiko Vivek Poddar Sangya Pundir Guoying Qi Ahmet Süreyya Rifaioğlu Daniel L Rice Rabie Saidi Elena Speretta E. B. Turner Nidhi Tyagi Preethi Vasudev Vladimir Volynkin Kate Warner Xavier Watkins Rossana Zaru Hermann Zellner Alan Bridge Lionel Breuza Elisabeth Coudert Damien Lieberherr Ivo Pedruzzi Sylvain Poux Manuela Pruess Nicole Redaschi Lucila Aimo Ghislaine Argoud‐Puy Andrea Auchincloss Kristian B. Axelsen Parit Bansal Delphine Baratin Teresa Batista Neto Marie-Claude Blatter Jerven Bolleman Emmanuel Boutet Cristina Casals‐Casas Beatrice Cuche Leyla Jael Castro Anne Estreicher L. Famiglietti Marc Feuermann Elisabeth Gasteiger Sébastien Géhant Vivienne Baillie Gerritsen Arnaud Gos Nadine Gruaz Ursula Hinz Chantal Hulo Nevila Hyka‐Nouspikel Florence Jungo Arnaud Kerhornou Philippe Le Mercier Thierry Lombardot Patrick Masson Anne Morgat Sandrine Pilbout Monica Pozzato Catherine Rivoire Christian Sigrist Shyamala Sundaram Cathy Wu Cecilia Arighi Hongzhan Huang Peter B. McGarvey Darren A. Natale Leslie Arminski Chuming Chen Chuming Chen

UniProt continues to support the ongoing process of making scientific data FAIR. Here we contribute to this with a FAIRness assessment of our UniProtKB dataset followed by a critical reflection on the challenges and future directions for the adoption and validation of the FAIR principles and metrics.

10.1038/s41597-019-0180-9 article EN cc-by Scientific Data 2019-09-20

Recently, the principles of graph theory are being adopted to address molecular and chemical structures investigations such as 3D protein structure prediction and spatial motifs discovery. Proteins have been parsed into graphs according to several approaches and methods, then studied based on graph theory concepts and data mining tools. In this paper we make a brief survey of the most used graph-based representations and propose a naïve method to help with graph making, since a key step of a valuable process is to build concise and correct graphs holding reliable...

10.1145/1562090.1562098 preprint EN 2009-06-28