Tiago Lubiana

ORCID: 0000-0003-2473-2313
Research Areas
  • Biomedical Text Mining and Ontologies
  • Wikis in Education and Collaboration
  • Semantic Web and Ontologies
  • Topic Modeling
  • Advanced Graph Neural Networks
  • Natural Language Processing Techniques
  • SARS-CoV-2 and COVID-19 Research
  • Genetics, Bioinformatics, and Biomedical Research
  • Bioinformatics and Genomic Networks
  • Genomics and Phylogenetic Studies
  • Mitochondrial Function and Pathology
  • Anesthesia and Neurotoxicity Research
  • Artificial Intelligence in Healthcare and Education
  • Research Data Management Practices
  • Cell Image Analysis Techniques
  • Microbial Metabolic Engineering and Bioproduction
  • Hybrid Renewable Energy Systems
  • Hereditary Neurological Disorders
  • Scientific Computing and Data Management
  • Microbial Community Ecology and Physiology
  • COVID-19 Clinical Research Studies
  • Cancer-related gene regulation
  • Academic Publishing and Open Access
  • Bone and Dental Protein Studies
  • Vaccine Coverage and Hesitancy

Universidade de São Paulo
2016-2024

Ronin Institute
2021-2023

Institute of Mathematics and Informatics
2022

Czech Academy of Sciences, Institute of Mathematics
2022

Universidade Federal do Rio de Janeiro
2021

Instituto Biológico
2018

Universidade Federal de São Paulo
2016

University of California, San Diego
2016

The rise of advanced chatbots, such as ChatGPT, has stirred excitement and curiosity in the scientific community. Powered by large language models (LLMs) based on generative pretrained transformers (GPTs), specifically GPT-3.5 and GPT-4, ChatGPT is considered a general-purpose technology with the potential to impact the job market and research endeavors in numerous fields [1]. Although similar models have been fine-tuned for biology-specific projects, including text-based analysis and biological sequence decoding [2,3],...

10.1371/journal.pcbi.1011319 article EN cc-by PLoS Computational Biology 2023-08-10

Bacterial type IV secretion systems (T4SS) are a highly diversified but evolutionarily related family of macromolecule transporters that can secrete proteins and DNA into the extracellular medium or into target cells. It was recently shown that a subtype of T4SS harboured by the plant pathogen Xanthomonas citri transfers toxins into target cells. Here, we show that a similar T4SS from the multi-drug-resistant opportunistic pathogen Stenotrophomonas maltophilia is proficient in killing competitor bacterial species. T4SS-dependent duelling between S. maltophilia and X....

10.1371/journal.ppat.1007651 article EN cc-by PLoS Pathogens 2019-09-12

The Complex Portal (www.ebi.ac.uk/complexportal) is a manually curated, encyclopaedic database of macromolecular complexes with known function from a range of model organisms. It summarizes complex composition, topology and function along with links to a large range of domain-specific resources (i.e. wwPDB, EMDB and Reactome). Since the last update in 2019, we have produced a first draft complexome for Escherichia coli, maintained and updated that of Saccharomyces cerevisiae, added over 40 coronavirus complexes and increased the human complexome to over 1100 complexes that include...

10.1093/nar/gkab991 article EN cc-by Nucleic Acids Research 2021-10-10

Ontologies are fundamental components of informatics infrastructure in domains such as biomedical, environmental, and food sciences, representing consensus knowledge in an accurate and computable form. However, their construction and maintenance demand substantial resources and necessitate substantial collaboration between domain experts, curators, and ontology experts. We present Dynamic Retrieval Augmented Generation of Ontologies using AI (DRAGON-AI), an ontology generation method employing Large Language Models (LLMs) and Retrieval Augmented Generation (RAG). DRAGON-AI can...

10.1186/s13326-024-00320-3 article EN cc-by Journal of Biomedical Semantics 2024-10-16

The novel coronavirus SARS-CoV-2, which emerged in late 2019, has since spread around the world and infected hundreds of millions of people with coronavirus disease 2019 (COVID-19). While this viral species was unknown prior to January 2020, its similarity to other coronaviruses that infect humans has allowed for rapid insight into the mechanisms that it uses to infect human hosts, as well as the ways in which the human immune system can respond. Here, we contextualize SARS-CoV-2 among other coronaviruses and identify what is known and what can be inferred about its behavior once inside a human host....

10.1128/msystems.00095-21 article EN cc-by mSystems 2021-10-26

Information related to the COVID-19 pandemic ranges from biological to bibliographic, from geographical to genetic and beyond. The structure of the raw data is highly complex, so converting it to meaningful insight requires data curation, integration, extraction and visualization, the global crowdsourcing of which provides both additional challenges and opportunities. Wikidata is an interdisciplinary, multilingual, open collaborative knowledge base of more than 90 million entities connected by well over a billion relationships. It...

10.3233/sw-210444 article EN other-oa Semantic Web 2021-09-28

The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers. Here, we introduce the Bioregistry, an integrative, open, community-driven...

10.1038/s41597-022-01807-3 article EN cc-by Scientific Data 2022-11-19

The rise of advanced chatbots, such as ChatGPT, has sparked curiosity in the scientific community. ChatGPT is a general-purpose chatbot powered by the large language models (LLMs) GPT-3.5 and GPT-4, with the potential to impact numerous fields, including computational biology. In this article, we offer ten tips based on our experience with ChatGPT to assist computational biologists in optimizing their workflows. We have collected relevant prompts and reviewed the nascent literature in the field, compiling tips we project to remain pertinent for future LLM...

10.48550/arxiv.2303.16429 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

Preprints have been increasingly used in biomedical science, and a key feature of many platforms is public commenting. The content of these comments, however, has not been well studied, and it is unclear whether they resemble those found in journal peer review. This study aimed to describe the content of the comments on the bioRxiv and medRxiv preprint platforms. In this cross-sectional study, preprints posted on the bioRxiv and medRxiv platforms in 2020 were accessed through each platform's application programming interface on March 29, 2021, and a random sample of preprints containing between 1 and 20 comments was...

10.1001/jamanetworkopen.2023.31410 article EN cc-by-nc-nd JAMA Network Open 2023-08-30

This article acts as a successor to the 10 simple rules for editing Wikipedia from a decade ago [1]. It addresses Wikipedia's machine-readable cousin: Wikidata, a project potentially even more relevant from the point of view of Computational Biology. Wikidata is a free collaborative knowledgebase [2] providing structured data to every Wikipedia page and beyond. It relies on the same peer production principle as Wikipedia: anyone can contribute. Open, collaborative models often surprise in how productively they work in practice, given how unlikely they might...

10.1371/journal.pcbi.1011235 article EN cc-by PLoS Computational Biology 2023-07-20

Introduction: Preprints have been increasingly used in biomedical sciences, providing the opportunity for research to be publicly assessed before journal publication. With the increase in attention over preprints during the COVID-19 pandemic, we decided to assess the content of comments left on preprint platforms. Methods: Preprints posted on bioRxiv and medRxiv in 2020 were accessed through each platform’s API, and a random sample of preprints that had received between 1 and 20 comments was analyzed. Comments were evaluated in triplicate by independent...

10.1101/2022.11.23.517621 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-11-24

Urgent global research demands real-time dissemination of precise data. Wikidata, a collaborative and openly licensed knowledge graph available in RDF format, provides an ideal forum for exchanging structured data that can be verified and consolidated using validation schemas and bot edits. In this article, we catalog an automatable task set necessary to assess and validate the portion of Wikidata relating to COVID-19 epidemiology. These tasks assess statistical data and are implemented in SPARQL, a query language for semantic...

10.7717/peerj-cs.1085 article EN cc-by PeerJ Computer Science 2022-09-29

The recombinant proteins, spider silk proteins and enzybiotics, will be expressed in Chlamydomonas reinhardtii strains by nuclear transformation. Each strain will express a different protein, which will contain the N- and C-terminal polymerization domains from native spider silk proteins. These domains are essential to the polymerization step and, subsequently, for the production of a material very similar to native silk. This material will be evaluated regarding its antimicrobial and mechanical properties, as well as the system productivity. The results may shed some light on silk-based...

10.3897/rio.2.e9342 article EN cc-by Research Ideas and Outcomes 2016-06-23

Throughout the global coronavirus pandemic, we have seen an unprecedented volume of COVID-19 research publications. This vast body of evidence continues to grow, making it difficult for research users to keep up with the pace of evolving research findings. To enable the synthesis of this evidence for timely use by researchers, policymakers, and other stakeholders, we developed an automated workflow to collect, categorise, and visualise the evidence from primary COVID-19 research studies. We trained a crowd of volunteer reviewers to annotate studies by relevance to COVID-19, study...

10.32384/jeahil17465 article EN cc-by Journal of EAHIL 2021-06-24

Wikipedia is one of the most important channels for the public communication of science and is frequently accessed as an educational resource in computational biology. Joint efforts between the International Society for Computational Biology (ISCB) and the Computational Biology taskforce of WikiProject Molecular Biology (a group of expert Wikipedia editors) have considerably improved the computational biology representation on Wikipedia in recent years. However, there is still an urgent need for further improvement in quality, especially when compared to related scientific fields such as genetics and medicine....

10.1093/bioinformatics/btac236 article EN Bioinformatics 2022-04-14

PanglaoDB is a database of cell-type markers widely used for single-cell RNA sequencing data analysis. However, cell types and genes in the database are encoded by free text, lacking proper identifiers. Wikidata is a freely editable knowledge graph useful for integrating biomedical knowledge. We thus reasoned that porting PanglaoDB’s data to the Wikidata platform could improve their reusability and overall technical quality (FAIRness). We mapped 188 cell types from PanglaoDB to species-neutral terms on Wikidata and created 376 species-specific terms for Homo...

10.1101/2024.04.12.589259 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2024-04-15

10.1016/j.lana.2024.100795 article EN cc-by-nc The Lancet Regional Health - Americas 2024-05-28

Ontologies are fundamental components of informatics infrastructure in domains such as biomedical, environmental, and food sciences, representing consensus knowledge in an accurate and computable form. However, their construction and maintenance demand substantial resources, necessitating substantial collaborative efforts of domain experts, curators, and ontology experts. We present Dynamic Retrieval Augmented Generation of Ontologies using AI (DRAGON-AI), an ontology generation method employing Large Language Models (LLMs) and Retrieval Augmented Generation (RAG). This method can...

10.48550/arxiv.2312.10904 preprint EN cc-by arXiv (Cornell University) 2023-01-01

The standardized identification of biomedical entities is a cornerstone of interoperability, reuse, and data integration in the life sciences. Several registries have been developed to catalog resources maintaining identifiers for biomedical entities such as small molecules, proteins, cell lines, and clinical trials. However, existing registries have struggled to provide sufficient coverage and metadata standards that meet the evolving needs of modern life sciences researchers. Here, we introduce the Bioregistry, an integrative, open,...

10.1101/2022.07.08.499378 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-07-10