- Topic Modeling
- Natural Language Processing Techniques
- Semantic Web and Ontologies
- Advanced Text Analysis Techniques
- Computational and Text Analysis Methods
- Biomedical Text Mining and Ontologies
- Sentiment Analysis and Opinion Mining
- Data Quality and Management
- Wikis in Education and Collaboration
University of Illinois Urbana-Champaign
2020-2025
Prior analyses and assessments of the impact of scientific research have mainly relied on analyzing its scope within academia and its influence within scholarly circles. However, by not considering the broader societal, economic, and policy implications of research projects, these studies overlook the ways in which discoveries contribute to technological innovation, public health improvements, environmental sustainability, and other areas of real-world application. We expand upon this prior work by developing and validating...
Automated text categorization methods are of broad relevance for domain experts since they free researchers and practitioners from manual labeling, save their resources (e.g., time, labor), and enrich the data with information helpful to study substantive questions. Despite a variety of newly developed methods that require substantial amounts of annotated data, little is known about how to build models when (a) labeling texts with categories requires expertise and/or in-depth reading, (b) only a few documents...
Prompt-based fine-tuning has become an essential method for eliciting information encoded in pre-trained language models for a variety of tasks, including text classification. For multi-class classification, prompt-based fine-tuning under low-resource scenarios has resulted in performance levels comparable to those of fully fine-tuning methods. Previous studies have used crafted prompt templates and verbalizers, mapping from the label terms space to the class space, to solve the classification problem as a masked language modeling task. However, cross-domain...
Hierarchical domain-specific classification schemas (or subject heading vocabularies) are often used to identify, classify, and disambiguate concepts that occur in scholarly articles. In this work, we develop, apply, and evaluate a human-in-the-loop workflow that first extracts an initial category tree from crowd-sourced Wikipedia data, and then combines community detection, machine learning, and hand-crafted heuristics or rules to prune the tree. This work resulted in WikiCSSH, a large-scale, hierarchically...