Deep active learning for classifying cancer pathology reports
Active learning
QH301-705.5
Computer applications to medicine. Medical informatics
R858-859.7
Deep learning
3. Good health
Machine Learning
Neoplasms
Text classification
Humans
Convolutional neural networks
Neural Networks, Computer
Biology (General)
Cancer pathology reports
Algorithms
Research Article
DOI: 10.1186/s12859-021-04047-1
Publication Date: 2021-03-09T09:03:37Z
AUTHORS (12)
ABSTRACT
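Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model. We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes. Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling.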
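To make the two uncertainty-sampling strategies named in the abstract concrete, the following is a minimal sketch in plain Python. It assumes the common textbook definitions (marginal: gap between the two highest predicted class probabilities; ratio: second-highest probability divided by the highest), which may differ in detail from the formulation used in the study. The function names, the `pool_probs` toy data, and the `select_batch` helper are hypothetical and stand in for the softmax outputs of a trained classifier on an unlabelled pool.

```python
import random


def margin_score(probs):
    """Gap between the two highest class probabilities.

    A small gap means the model is torn between two classes.
    """
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top1 - top2


def ratio_score(probs):
    """Second-highest probability divided by the highest.

    A value close to 1 means the two leading classes are nearly tied.
    """
    top1, top2 = sorted(probs, reverse=True)[:2]
    return top2 / top1


def select_batch(pool_probs, batch_size, strategy="margin", seed=0):
    """Return indices of unlabelled samples to send to annotators.

    pool_probs: one list of class probabilities per unlabelled sample
    (assumed here to come from a classifier's softmax output).
    """
    indices = list(range(len(pool_probs)))
    if strategy == "random":
        # Baseline: no active learning, just a uniform random draw.
        return random.Random(seed).sample(indices, batch_size)
    if strategy == "margin":
        # Most uncertain first: smallest margin.
        ranked = sorted(indices, key=lambda i: margin_score(pool_probs[i]))
    elif strategy == "ratio":
        # Most uncertain first: ratio closest to 1.
        ranked = sorted(indices, key=lambda i: ratio_score(pool_probs[i]),
                        reverse=True)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return ranked[:batch_size]


# Toy pool of four unlabelled reports with three candidate classes each.
pool_probs = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # fairly uncertain
    [0.50, 0.48, 0.02],  # two classes nearly tied
    [0.70, 0.20, 0.10],  # moderately confident
]

print(select_batch(pool_probs, 2, "margin"))  # [2, 1]
print(select_batch(pool_probs, 2, "ratio"))   # [2, 1]
```

In an active learning loop, the selected indices would be labelled by annotators, moved from the pool into the training set, and the model retrained before the next selection round. On this toy pool both strategies pick the same samples; on real data their rankings can differ.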
REFERENCES (31)
CITATIONS (24)