NFDI4DS | UHH-SEMS - Publication Details

Paul Donner

ORCID: 0000-0001-5737-8483

About

Contact & Profiles

A5000931815

Research Areas

scientometrics and bibliometrics research
Biomedical Text Mining and Ontologies
Topic Modeling
Research Data Management Practices
Innovation, Technology, and Society
Data Quality and Management
Meta-analysis and systematic reviews
Advanced Research in Science and Engineering
Expert finding and Q&A systems
Advanced Clustering Algorithms Research
Data-Driven Disease Surveillance
Advanced Data Processing Techniques
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Astronomical Observations and Instrumentation
Corporate Governance and Management
Online Learning and Analytics
Regional Development and Policy
Spatial and Panel Data Analysis
Web visibility and informetrics
Evaluation and Performance Assessment
Technology Adoption and User Behaviour
Gender and Technology in Education
Advanced Statistical Process Monitoring
Health and Medical Studies

German Centre for Higher Education Research and Science Studies
2017-2025

Institute for Research Information and Quality Assurance
2014

Humboldt-Universität zu Berlin
2011

Document type assignment accuracy in the journal citation index data of Web of Science

OPENALEX - Publications

Paul Donner

10.1007/s11192-017-2483-y article EN Scientometrics 2017-08-04

On author self-citations as carriers of scientific topic structure information

Paul Donner, Edwin A. Henneken

Author self-citations are a somewhat controversial phenomenon. Some scholars maintain they are a normal, even indispensable, part of scientific referencing practice, while others claim they are frequently an expression of vanity and self-promotion. Citations are the basic data for citation network clustering, an important approach to creating bottom-up, data-driven, global taxonomic systems of research publications. Thus the topical information content of self-citations is of particular interest in this context. Since it is not yet known...

10.1162/qss_a_00357 article EN cc-by Quantitative Science Studies 2025-02-24

Reference coverage analysis of OpenAlex compared to Web of Science and Scopus

Jack H. Culbert, Anne Hobert, Najko Jahn, Nick Haupka, Marion Schmidt, and 2 more

OpenAlex is a promising open source of scholarly metadata, and competitor to established proprietary sources, such as the Web of Science and Scopus. As OpenAlex provides its data freely and openly, it permits researchers to perform bibliometric studies that can be reproduced in the community without licensing barriers. However, as OpenAlex is a rapidly evolving source and the data contained within is expanding and also quickly changing, the question naturally arises as to the trustworthiness of its data. In this report, we will study the reference coverage and selected metadata...

10.1007/s11192-025-05293-3 article EN cc-by Scientometrics 2025-04-10

Comparing institutional-level bibliometric research performance indicator values based on different affiliation disambiguation systems

Paul Donner, Christine Rimmert, Nees Jan van Eck

The present study is an evaluation of three frequently used institution name disambiguation systems. The Web of Science normalized institution names and Organization Enhanced system and the Scopus Affiliation ID system are tested against a complete, independent institution disambiguation system for a sample of German public sector research organizations. The independent system is used as the gold standard in the evaluations that we perform. We study the coverage of the disambiguation systems and, in particular, the differences in a number of commonly used bibliometric indicators. The key finding is that for the sample institutions, the studied systems provide bibliometric indicator values that have only...

10.1162/qss_a_00013 article EN cc-by Quantitative Science Studies 2019-12-11

Reference Coverage Analysis of OpenAlex compared to Web of Science and Scopus

Jack H. Culbert, Anne Hobert, Najko Jahn, Nick Haupka, Marion Schmidt, and 2 more

OpenAlex is a promising open source of scholarly metadata, and competitor to the established proprietary sources, the Web of Science and Scopus. As OpenAlex provides its data freely and openly, it permits researchers to perform bibliometric studies that can be reproduced in the community without licensing barriers. However, as OpenAlex is a rapidly evolving source and the data contained within is expanding and also quickly changing, the question naturally arises as to the trustworthiness of its data. In this empirical paper, we will study the reference and metadata coverage within each database...

10.48550/arxiv.2401.16359 preprint EN arXiv (Cornell University) 2024-01-29

Effect of publication month on citation impact

Paul Donner

10.1016/j.joi.2018.01.012 article EN Journal of Informetrics 2018-02-01

Reading in 2110 – reading behavior and reading devices: a case study

Kathrin Grzeschik, Yevgeniya Kruppa, Diana Marti, Paul Donner

Purpose: The purpose of these experiments is to find out whether and how reading behavior might be influenced by reading devices. Design/methodology/approach: In total, three experiments, the first one more independent from the second and third, investigate how European Library and Information Science students react to electronic reading devices, unfamiliar as they are with them. The second and third explore implications such as reading rate, concentration and symptoms of fatigue in conjunction with reading devices. Test objects were the Sony eBook Reader, the IREX iLiad, LCD computer...

10.1108/02640471111141052 article EN The Electronic Library 2011-06-07

Data inaccuracy quantification and uncertainty propagation for bibliometric indicators

Paul Donner

This study introduces an approach to estimate the uncertainty in bibliometric indicator values that is caused by data errors. This approach utilizes Bayesian regression models, estimated from empirical data samples, which are used to predict error-free data. Through direct Monte Carlo simulation (drawing many replicates of predicted data from the estimated regression models for the same input data), probability distributions of indicator values can be obtained which provide information on their uncertainty due to data errors. It is demonstrated how uncertainty in base quantities, such as the number of publications of certain...

10.1093/reseval/rvae047 article EN Research Evaluation 2024-01-01

The implicit preference of bibliometrics for basic research

Paul Donner, Ulrich Schmoch

By individually associating articles to basic or applied research, it is shown that basic articles are cited more frequently than applied ones. Dividing the subject categories of the Web of Science into a basic and an applied part, the mean field-normalization rate is referred to the applied or basic part depending on the research orientation of the paper analysed. By this approach, a distinct difference of citations for the applied and basic parts of most subject categories is found. However, differences of citation scores of organisations are found as well, but are less clear. The explanation is that organisations generally publish a mix of basic and applied articles. In...

10.1007/s11192-020-03516-3 article EN cc-by Scientometrics 2020-05-25

Validation of the Astro dataset clustering solutions with external data

Paul Donner

10.1007/s11192-020-03780-3 article EN Scientometrics 2020-11-21

Enhanced self‐citation detection by fuzzy author name matching and complementary error estimates

Paul Donner

In this article I investigate the shortcomings of exact string match‐based author self‐citation detection methods. The contributions of this study are twofold. First, I apply a fuzzy string matching algorithm for self‐citation detection and benchmark this approach and other common methods of exclusively author name‐based self‐citation detection against a manually curated ground truth sample. Near full recall can be achieved with the proposed method while incurring only negligible precision loss. Second, I report some important observations from the results about the extent of latent...

10.1002/asi.23399 article EN Journal of the Association for Information Science and Technology 2014-12-02

Identifying constitutive articles of cumulative dissertation theses by bilingual text similarity. Evaluation of similarity methods on a new short text task

Paul Donner

Cumulative dissertations are doctoral theses comprised of multiple published articles. For studies of publication activity and citation impact of early career researchers, it is important to identify these articles and link them to their associated theses. Using a new benchmark data set, this paper reports on experiments measuring the bilingual textual similarity between, on the one hand, titles and keywords of theses, and, on the other hand, articles’ abstracts. The tested methods are cosine similarity and L1 distance in the Vector Space Model...

10.1162/qss_a_00152 article EN cc-by Quantitative Science Studies 2021-01-01

Citation analysis of Ph.D. theses with data from Scopus and Google Books

Paul Donner

This study investigates the potential of citation analysis of Ph.D. theses to obtain valid and useful early career performance indicators at the level of university departments. For German theses from 1996 to 2018 the suitability of citation data from Scopus and Google Books is studied and found to be sufficient to obtain quantitative estimates of early career researchers’ performance at departmental level in terms of scientific recognition and use of their dissertations as reflected in citations. Scopus and Google Books citations complement each other and have little overlap. Individual theses’ citation counts are much higher...

10.1007/s11192-021-04173-w article EN cc-by Scientometrics 2021-10-24

A validation of coauthorship credit models with empirical data from the contributions of PhD candidates

Paul Donner

A perennial problem in bibliometrics is the appropriate distribution of authorship credit for coauthored publications. Several credit allocation methods and formulas have been introduced, but there has been little empirical validation as to which method best reflects the typical contributions of coauthors. This paper presents a validation using a new data set of author-provided percentage contribution figures obtained from the coauthored publications of cumulative PhD theses by authors from three countries that contain author contribution statements. The comparison...

10.1162/qss_a_00048 article EN cc-by Quantitative Science Studies 2020-05-04

Remarks on modified fractional counting

Paul Donner

10.1016/j.joi.2024.101585 article EN Journal of Informetrics 2024-08-30

Towards a valid bibliometric measure of epistemic breadth of researchers

Paul Donner, Clemens Blümel

The concept of epistemic breadth of the work of a researcher refers to the scope of their knowledge claims, as reflected in their published research reports. Studies of epistemic breadth have been hampered by the lack of a validated measure of the concept. Here we introduce a knowledge space approach to the measurement of epistemic breadth and propose to use the semantic similarity network of an author's publication record to operationalize a measure. In this approach, each paper has its own location in a common abstract vector space based on its content. Proximity in the space corresponds to thematic similarity of publications. Candidate...

10.48550/arxiv.2411.02005 preprint EN arXiv (Cornell University) 2024-11-04

Algorithmic identification of Ph.D. thesis-related publications: a proof-of-concept study

Paul Donner

In this study we propose and evaluate a method to automatically identify the journal publications that are related to a Ph.D. thesis using bibliographical data of both items. We build a manually curated ground truth dataset from German cumulative doctoral theses that explicitly list the included publications, which we match with records in the Scopus database. We then test supervised classification methods on the task of identifying the correct associated publications among high numbers of potential candidates using features of the thesis and publication...

10.1007/s11192-022-04480-w article EN cc-by Scientometrics 2022-08-18

Drawbacks of Normalization by Percentile Ranks in Citation Impact Studies

Paul Donner

10.6182/jlis.202212_20(2).075 article EN DOAJ (DOAJ: Directory of Open Access Journals) 2022-12-01

Correction to: The implicit preference of bibliometrics for basic research

Paul Donner, Ulrich Schmoch

10.1007/s11192-021-04181-w article EN cc-by Scientometrics 2021-11-06
