- Topic Modeling
- Biomedical Text Mining and Ontologies
- Natural Language Processing Techniques
- Machine Learning in Healthcare
Georgia Institute of Technology
2023
David Kartchner, Jennifer Deng, Shubham Lohiya, Tejasri Kopparthi, Prasanth Bathala, Daniel Domingo-Fernández, Cassie Mitchell. Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing. 2023.
This work presents a new document classification dataset, BioSift, to expedite the initial selection and labeling of studies for drug repurposing. The dataset consists of 10,000 human-annotated abstracts from scientific articles in PubMed. Each abstract is labeled with up to eight attributes necessary to perform meta-analysis using the popular patient-intervention-comparator-outcome (PICO) method: has human subjects, clinical trial/cohort, population size, target disease, study drug,...