A dictionary to identify small molecules and drugs in free text

DrugBank Identification Data dictionary
DOI: 10.1093/bioinformatics/btp535
Publication Date: 2009-09-17T01:30:31Z
ABSTRACT
From the scientific community, a lot of effort has been spent on the correct identification of gene and protein names in text, while less effort has been spent on the correct identification of chemical names. Dictionary-based term identification has the power to recognize the diverse representation of chemical information in the literature and map the chemicals to their database identifiers.

We developed a dictionary for the identification of small molecules and drugs in text, combining information from UMLS, MeSH, ChEBI, DrugBank, KEGG, HMDB and ChemIDplus. Rule-based term filtering, a manual check of highly frequent terms, and disambiguation rules were applied. We tested the combined dictionary and the dictionaries derived from the individual resources on an annotated corpus, and conclude the following: (i) each of the different processing steps increases precision with a minor loss of recall; (ii) the overall performance of the combined dictionary is acceptable (precision 0.67, recall 0.40 (0.80 for trivial names)); (iii) the combined dictionary performed better than the dictionary in the chemical recognizer OSCAR3; (iv) the performance of a dictionary based on ChemIDplus alone is comparable to the performance of the combined dictionary.

The combined dictionary is freely available as an XML file in Simple Knowledge Organization System format on the web site http://www.biosemantics.org/chemlist.
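The idea of dictionary-based identification, term filtering, and scoring against an annotated corpus can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy dictionary, the `CHEM:` identifiers, the stoplist, and the function names `filter_terms`, `annotate`, and `precision_recall` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mini-dictionary (term -> identifier); not taken from the real resource.
TOY_DICTIONARY = {
    "aspirin": "CHEM:0001",
    "acetylsalicylic acid": "CHEM:0001",
    "caffeine": "CHEM:0002",
    "lead": "CHEM:0003",  # also a common English verb, so it is ambiguous
}

# Hypothetical stoplist standing in for a manual check of frequent, ambiguous terms.
AMBIGUOUS_TERMS = {"lead"}


def filter_terms(dictionary, min_length=3, stoplist=AMBIGUOUS_TERMS):
    """Drop very short terms and terms on the stoplist (an assumed, simplified rule)."""
    return {
        term: ident
        for term, ident in dictionary.items()
        if len(term) >= min_length and term not in stoplist
    }


def annotate(text, dictionary):
    """Return {(start, end, identifier)} for case-insensitive whole-word matches.

    Longer terms are tried first so that a long name wins over a shorter
    name that overlaps it.
    """
    found = set()
    taken = []  # character spans that are already annotated
    for term in sorted(dictionary, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            if any(match.start() < end and start < match.end() for start, end in taken):
                continue  # overlaps an earlier, longer match
            taken.append((match.start(), match.end()))
            found.add((match.start(), match.end(), dictionary[term]))
    return found


def precision_recall(predicted, gold):
    """Exact-match precision and recall between two annotation sets."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    text = "Aspirin and caffeine lead to faster relief."
    # Hand-made gold annotations for this one sentence (illustrative only).
    gold = {(0, 7, "CHEM:0001"), (12, 20, "CHEM:0002")}
    print(precision_recall(annotate(text, TOY_DICTIONARY), gold))
    print(precision_recall(annotate(text, filter_terms(TOY_DICTIONARY)), gold))
```

In this toy sentence the unfiltered dictionary wrongly tags the verb "lead", and the filtered dictionary does not, which mirrors in miniature how a filtering step can raise precision. With a real corpus, filtering may also remove some correct matches, which is the recall cost the abstract mentions.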