NFDI4DS | UHH-SEMS - Publication Details

Jack H. Culbert

ORCID: 0009-0000-1581-4021

Contact & Profiles

A5092442081

Research Areas

Research Data Management Practices
Topic Modeling
Scientometrics and Bibliometrics Research
Diverse Approaches in Healthcare and Education Studies
Data Quality and Management
Biomedical Text Mining and Ontologies
Natural Language Processing Techniques
Text and Document Classification Technologies
Advanced Text Analysis Techniques
Social Media and Politics

Originality in scientific titles and abstracts can predict citation count

OPENALEX - Publications

Jack H. Culbert, Yoed N. Kenett, Philipp Mayr

In this research-in-progress paper, we apply a computational measure correlating with originality from creativity science: Divergent Semantic Integration (DSI), to a selection of 99,557 scientific abstracts and titles selected from the Web of Science. We observe statistically significant differences in DSI between subject and field of research, and a slight rise in DSI over time. We model the base 10 logarithm of the citation count after 5 years and find a positive correlation in all fields of research, with an adjusted $R^2$ of 0.13.

10.48550/arxiv.2502.01417 preprint EN arXiv (Cornell University) 2025-02-03

Reference coverage analysis of OpenAlex compared to Web of Science and Scopus

Jack H. Culbert, Anne Hobert, Najko Jahn, Nick Haupka, Marion Schmidt and 2 more

OpenAlex is a promising open source of scholarly metadata, and a competitor to established proprietary sources, such as the Web of Science and Scopus. As OpenAlex provides its data freely and openly, it permits researchers to perform bibliometric studies that can be reproduced in the community without licensing barriers. However, as OpenAlex is rapidly evolving and the data contained within is expanding and also quickly changing, the question naturally arises as to the trustworthiness of its data. In this report, we will study the reference coverage and selected metadata...

10.1007/s11192-025-05293-3 article EN cc-by Scientometrics 2025-04-10

Reference Coverage Analysis of OpenAlex compared to Web of Science and Scopus

Jack H. Culbert, Anne Hobert, Najko Jahn, Nick Haupka, Marion Schmidt and 2 more

OpenAlex is a promising open source of scholarly metadata, and a competitor to the established proprietary sources, the Web of Science and Scopus. As OpenAlex provides its data freely and openly, it permits researchers to perform bibliometric studies that can be reproduced in the community without licensing barriers. However, as OpenAlex is rapidly evolving and the data contained within is expanding and also quickly changing, the question naturally arises as to the trustworthiness of its data. In this empirical paper, we will study the reference and metadata coverage of each database...

10.48550/arxiv.2401.16359 preprint EN arXiv (Cornell University) 2024-01-29

Analysis of the Publication and Document Types in OpenAlex, Web of Science, Scopus, Pubmed and Semantic Scholar

Nick Haupka, Jack H. Culbert, Alexander Schniedermann, Najko Jahn, Philipp Mayr

This study compares and analyses publication and document types in the following bibliographic databases: OpenAlex, Scopus, Web of Science, Semantic Scholar and PubMed. The results demonstrate that typologies can differ considerably between individual database providers. Moreover, the distinction between research and non-research texts, which is required to identify relevant documents for bibliometric analysis, can vary depending on the data source because publications are classified differently in the respective databases. The focus...

10.48550/arxiv.2406.15154 preprint EN arXiv (Cornell University) 2024-06-21

Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine against COVID-19 Literature: Comparative Study

Xu Tong, Nina Smirnova, Sharmila Upadhyaya, Ran Yu, Jack H. Culbert and 3 more

Objective: To explore and compare the performance of ChatGPT and other state-of-the-art LLMs on domain-specific NER tasks covering different entity types and domains in TCM against COVID-19 literature. Methods: We established a dataset of 389 articles on TCM against COVID-19, and manually annotated 48 of them with 6 types of entities belonging to 3 domains as the ground truth, against which NER performance can be assessed. We then performed NER using ChatGPT (GPT-3.5 and GPT-4) and 4 BERT-based question-answering (QA) models (RoBERTa, MiniLM, PubMedBERT and SciBERT) without prior training...

10.48550/arxiv.2408.13501 preprint EN arXiv (Cornell University) 2024-08-24

4TCT, A 4chan Text Collection Tool

Jack H. Culbert

4chan is a popular online imageboard which has been widely studied due to an observed concentration of far-right, antisemitic, racist, misogynistic, and otherwise hateful material being posted to the site, as well as the emergence of political movements and the evolution of memes which are posted there, discussed in Section 1.1. We have created a tool developed in Python which utilises the 4chan API to collect data from a selection of boards. This paper accompanies the release of the code via the github repository: https://github.com/jhculb/4TCT. We believe this tool will be of use...

10.48550/arxiv.2307.03556 preprint EN cc-by-sa arXiv (Cornell University) 2023-01-01

Utilizing Large Language Models for Named Entity Recognition in Traditional Chinese Medicine against COVID-19 Literature: Comparative Study (Preprint)

Xu Tong, Nina Smirnova, Sharmila Upadhyaya, Ran Yu, Jack H. Culbert and 3 more

Background: Recent advances in large language models (LLMs) have shown remarkable performance on various downstream tasks in zero- and few-shot scenarios, shedding light on named entity recognition (NER) in low-resource domains. Traditional Chinese medicine (TCM) against COVID-19 has been a new research topic and has led to a niche literature. NER techniques are crucial for extracting and utilizing the rich knowledge in such literature. Objective: To explore and compare the performance of ChatGPT and other...

10.2196/preprints.54346 preprint EN 2023-11-07
