OrganismTagger: detection, normalization and grounding of organism entities in biomedical documents
Named Entity Recognition
DOI: 10.1093/bioinformatics/btr452
Publication Date: 2011-08-10T00:42:31Z
AUTHORS (4)
ABSTRACT
Motivation: Semantic tagging of organism mentions in full-text articles is an important part of literature mining and semantic enrichment solutions. Tagged organism mentions also play a pivotal role in disambiguating other entities in a text, such as proteins. A high-precision organism tagging system must be able to detect the numerous forms of organism mentions, including common names as well as the traditional taxonomic groups: genus, species and strains. In addition, such a system must resolve abbreviations and acronyms, assign the scientific name and, if possible, link the detected mention to the NCBI Taxonomy database for further semantic queries and literature navigation.
Results: We present the OrganismTagger, a hybrid rule-based/machine learning system to extract organism mentions from the literature. It includes tools for automatically generating lexical and ontological resources from a copy of the NCBI Taxonomy database, thereby facilitating system updates by end users. Its novel ontology-based resources can also be reused in other semantic mining and linked data tasks. Each detected organism mention is normalized to a canonical name through the resolution of acronyms and abbreviations and subsequently grounded with an NCBI Taxonomy database ID. In particular, our system combines a novel machine-learning approach with rule-based and lexical methods for detecting strain mentions in documents. On our manually annotated OT corpus, the OrganismTagger achieves a precision of 95%, a recall of 94% and a grounding accuracy of 97.5%. On the manually annotated corpus of Linnaeus-100, the results show a precision of 99%, a recall of 97% and a grounding accuracy of 97.4%.
Availability: The OrganismTagger, including supporting tools, resources, training data and manual annotations, as well as end user and developer documentation, is freely available under an open-source license at http://www.semanticsoftware.info/organism-tagger.
Contact: witte@semanticsoftware.info
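The three steps named in the abstract (detection, normalization to a canonical name, and grounding with an NCBI Taxonomy ID) can be illustrated with a small dictionary-based sketch. This is not the OrganismTagger implementation: the lexicon, the `abbreviate` and `tag_organisms` helpers, and the purely lexical matching are assumptions made for illustration, and the sketch ignores strain detection and ambiguous abbreviations entirely. The three taxonomy IDs are the real NCBI identifiers for those species.

```python
import re

# Toy lexicon (assumption): scientific name -> NCBI Taxonomy ID.
TAXONOMY = {
    "Escherichia coli": 562,
    "Homo sapiens": 9606,
    "Mus musculus": 10090,
}

# Toy common-name table (assumption): common name -> scientific name.
COMMON_NAMES = {"human": "Homo sapiens", "mouse": "Mus musculus"}


def abbreviate(name):
    """Turn 'Escherichia coli' into the abbreviated form 'E. coli'."""
    genus, species = name.split(" ", 1)
    return f"{genus[0]}. {species}"


def tag_organisms(text):
    """Return (mention, canonical name, taxonomy ID) tuples in text order."""
    # Map every known surface form to its canonical scientific name.
    surface = {name: name for name in TAXONOMY}
    surface.update({abbreviate(name): name for name in TAXONOMY})
    surface.update(COMMON_NAMES)

    # Try longer surface forms first so full names win over shorter ones.
    alternatives = sorted(surface, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(s) for s in alternatives) + r")\b"
    )

    results = []
    for match in pattern.finditer(text):
        mention = match.group(0)          # detection
        canonical = surface[mention]      # normalization
        tax_id = TAXONOMY[canonical]      # grounding
        results.append((mention, canonical, tax_id))
    return results
```

Calling `tag_organisms("E. coli was compared with human cells.")` yields one tuple per mention, each carrying the original text, the scientific name, and the taxonomy ID. Matching is case-sensitive and exact, so inflected forms such as plurals are not covered by this sketch.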