Deep active learning for classifying cancer pathology reports

Keywords: Active learning; Deep learning; Machine Learning; Text classification; Convolutional neural networks; Neural Networks, Computer; Cancer pathology reports; Neoplasms; Humans; Algorithms; Computer applications to medicine. Medical informatics (R858-859.7); Biology (General) (QH301-705.5); 3. Good health; Research Article
DOI: 10.1186/s12859-021-04047-1 Publication Date: 2021-03-09T09:03:37Z
ABSTRACT
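Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model.

We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that, compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes.

Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling.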
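The abstract names marginal and ratio uncertainty sampling without defining them. As a rough illustration only, the sketch below uses one common formulation: both scores compare the two highest predicted class probabilities for each unlabelled sample, and the samples with the highest uncertainty are sent for annotation. The function names, the toy probabilities, and the random baseline are assumptions made for this example; the paper's exact definitions and implementation may differ.

```python
import random


def top_two(probs):
    """Return the largest and second-largest class probabilities."""
    first, second = sorted(probs, reverse=True)[:2]
    return first, second


def margin_score(probs):
    """Marginal uncertainty (assumed form): a small gap between the
    top two classes gives a score close to 1, i.e. very uncertain."""
    first, second = top_two(probs)
    return 1.0 - (first - second)


def ratio_score(probs):
    """Ratio uncertainty (assumed form): second-best over best
    probability; values close to 1 mean very uncertain."""
    first, second = top_two(probs)
    return second / first


def select_batch(pool_probs, batch_size, score_fn=None, seed=0):
    """Return indices of unlabelled samples to send to annotators.

    With score_fn=None this is the random sampling baseline,
    i.e. no active learning.
    """
    indices = list(range(len(pool_probs)))
    if score_fn is None:
        return random.Random(seed).sample(indices, batch_size)
    # Highest uncertainty first; sorted() is stable, so ties keep pool order.
    ranked = sorted(indices, key=lambda i: score_fn(pool_probs[i]), reverse=True)
    return ranked[:batch_size]


# Hypothetical model outputs for four unlabelled reports, three classes each.
pool_probs = [
    [0.90, 0.05, 0.05],  # confident prediction
    [0.40, 0.35, 0.25],  # top two classes are close
    [0.50, 0.49, 0.01],  # top two classes are nearly tied
    [0.70, 0.20, 0.10],
]

margin_pick = select_batch(pool_probs, 2, margin_score)
ratio_pick = select_batch(pool_probs, 2, ratio_score)
random_pick = select_batch(pool_probs, 2)
```

In an active learning loop, the selected samples would be labelled, added to the training set, and the model retrained before the next round of scoring. The batch size plays the role of the 1K or 15K samples added per iteration in the study.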