A natural language processing system for the efficient updating of highly curated pathophysiology mechanism knowledge graphs
Relevance
ENCODE
Identification
Biomedical text mining
Named Entity Recognition
DOI:
10.1016/j.ailsci.2023.100078
Publication Date:
2023-06-16T04:57:57Z
AUTHORS (14)
ABSTRACT
Biomedical knowledge graphs (KGs) have become crucial for describing biological findings in a structured manner. To keep up with the constantly changing flow of knowledge, their embedded information must be regularly updated with the latest findings. Natural language processing (NLP) has created new possibilities for automating this upkeep by facilitating extraction from free text. However, owing to the limited availability of annotated and labeled biomedical data, the development of completely autonomous systems remains a substantial scientific and technological hurdle. This study aims to explore the methodologies best suited to support the automatic extraction of causal relationships from the literature, with the aim of regular and rapid updating of disease-specific pathophysiology mechanism KGs. Our proposed approach first searches and retrieves PubMed abstracts using the desired terms and keywords. The extension corpora are then passed through an NLP pipeline for extraction. We identify triples representing cause-and-effect relationships and encode their content in the Biological Expression Language (BEL). Finally, domain experts perform an analysis of the completeness, relevance, accuracy, and novelty of the extracted triples. In our test scenario, which is focused on a KG regarding the phosphorylation of the Tau protein, the pipeline successfully contributed novel data, which was subsequently used to update the KG, leading to the identification of six additional upstream regulators of Tau phosphorylation. Here, it is demonstrated that the NLP-based workflow we developed is capable of rapidly updating pathophysiology mechanism graphs. As a result, production-scale, semi-automated updating of pre-existing, curated mechanism graphs is enabled.
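The triple-encoding and novelty steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `CausalTriple` class and the `novel_triples` helper are hypothetical names, the set-membership filter is only a simple stand-in for the expert novelty review, and the two kinase examples are illustrative placeholders rather than results reported by the study. The BEL strings assume BEL 2.0 syntax with HGNC gene symbols.

```python
from dataclasses import dataclass

# A subset of BEL causal relations; the full language defines more.
CAUSAL_RELATIONS = {
    "increases",
    "decreases",
    "directlyIncreases",
    "directlyDecreases",
}


@dataclass(frozen=True)
class CausalTriple:
    """Hypothetical container for one extracted cause-and-effect triple."""

    subject: str   # BEL term for the cause
    relation: str  # BEL causal relation
    obj: str       # BEL term for the effect

    def to_bel(self) -> str:
        """Serialize the triple as a single BEL statement."""
        if self.relation not in CAUSAL_RELATIONS:
            raise ValueError(f"not a causal BEL relation: {self.relation!r}")
        return f"{self.subject} {self.relation} {self.obj}"


def novel_triples(extracted, existing_kg):
    """Return extracted triples absent from the existing KG.

    Order is preserved and duplicates within `extracted` are dropped.
    """
    seen = set(existing_kg)
    result = []
    for triple in extracted:
        if triple not in seen:
            seen.add(triple)
            result.append(triple)
    return result


# Illustrative placeholder data (assumption, not taken from the study).
tau_p = "p(HGNC:MAPT, pmod(Ph))"  # phosphorylated Tau protein
existing = {CausalTriple("p(HGNC:GSK3B)", "increases", tau_p)}
extracted = [
    CausalTriple("p(HGNC:GSK3B)", "increases", tau_p),
    CausalTriple("p(HGNC:CDK5)", "increases", tau_p),
]

for triple in novel_triples(extracted, existing):
    print(triple.to_bel())
```

In a semi-automated setting such as the one the abstract describes, the statements surviving this filter would be candidates handed to domain experts for review rather than facts written directly into the graph.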
REFERENCES (112)
CITATIONS (1)