NFDI4DS | UHH-SEMS - Publication Details

EXCITE – A Toolchain to Extract, Match and Publish Open Literature References

OPENALEX - Publications

Azam Hosseini, Behnam Ghavimi, Zeyd Boukhers, Philipp Mayr

This demo paper presents a generic toolchain to extract, segment and match literature references from full text PDF files in the project EXCITE. The aim of EXCITE is extracting and matching citations from social science publications and making more citation data available to researchers. Each single step of the pipeline and the open source tools used to accomplish the tasks are explained. A public system which integrates all components under a user-friendly interface is put forward and illustrated. As a final step, a special component...

10.1109/jcdl.2019.00105 article EN 2019-06-01

Identifying and Improving Dataset References in Social Sciences Full Texts

OPENALEX - Publications

Behnam Ghavimi, Philipp Mayr, Sahar Vahdati, Christoph Lange

Scientific full text papers are usually stored in separate places than their underlying research datasets. Authors typically make references to datasets by mentioning them, for example by using their titles and the year of publication. However, in most cases explicit links that would provide readers with direct access to the referenced datasets are missing. Manually detecting references to datasets in papers is time consuming and requires an expert in the domain of the paper. In order to make explicit all links to datasets in papers that have been published already, we suggest and evaluate a semi-automatic approach for finding...

10.5281/zenodo.44608 preprint EN arXiv (Cornell University) 2016-03-06

A semi-automatic approach for detecting dataset references in social science texts

OPENALEX - Publications

Behnam Ghavimi, Philipp Mayr, Christoph Lange, Sahar Vahdati, Sören Auer

Today, full-texts of scientific articles are often stored in different locations than the used datasets. Dataset registries aim at a closer integration by making datasets citable, but authors typically refer to datasets using inconsistent abbreviations and heterogeneous metadata (e.g. title, publication year). It is thus hard to reproduce research results, to access datasets for further analysis, and to determine the impact of a dataset. Manually detecting references to datasets is time-consuming and requires expert knowledge in the underlying domain. We propose...

10.3233/isu-160816 article EN Information Services & Use 2016-12-24

Identifying and Improving Dataset References in Social Sciences Full Texts

OPENALEX - Publications

Behnam Ghavimi, Philipp Mayr, Sahar Vahdati, Christoph Lange

Scientific full text papers are usually stored in separate places than their underlying research datasets. Authors typically make references to datasets by mentioning them, for example by using their titles and the year of publication. However, in most cases explicit links that would provide readers with direct access to the referenced datasets are missing. Manually detecting references to datasets in papers is time consuming and requires an expert in the domain of the paper. In order to make explicit all links to datasets in papers that have been published already, we suggest and evaluate a semi-automatic approach for finding...

10.48550/arxiv.1603.01774 preprint EN other-oa arXiv (Cornell University) 2016-01-01

EXmatcher: Combining Features Based on Reference Strings and Segments to Enhance Citation Matching

OPENALEX - Publications

Behnam Ghavimi, Wolfgang Otto, Philipp Mayr

Citation matching is a challenging task due to different problems such as the variety of citation styles, mistakes in reference strings and the quality of identified segments. The classic configuration used in this paper is a combination of a blocking technique and a binary classifier. Three possible inputs (reference strings, segments, and a combination of reference strings and segments) were tested to find the most efficient strategy for matching. In the classification step, we describe the effect which the probabilities of segments can have in matching. Our evaluation on a manually curated gold...

10.48550/arxiv.1906.04484 preprint EN other-oa arXiv (Cornell University) 2019-01-01

A Semi-Automatic Approach for Detecting Dataset References in Social Science Texts

OPENALEX - Publications

Behnam Ghavimi, Philipp Mayr, Christoph Lange, Sahar Vahdati, Sören Auer

Today, full-texts of scientific articles are often stored in different locations than the used datasets. Dataset registries aim at a closer integration by making datasets citable, but authors typically refer to datasets using inconsistent abbreviations and heterogeneous metadata (e.g. title, publication year). It is thus hard to reproduce research results, to access datasets for further analysis, and to determine the impact of a dataset. Manually detecting references to datasets is time-consuming and requires expert knowledge in the underlying domain. We...

10.48550/arxiv.1611.01820 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Highly cited references in PLOS ONE and their in-text usage over time

OPENALEX - Publications

Wolfgang Otto, Behnam Ghavimi, Philipp Mayr, Rajesh Piryani, Vivek Kumar Singh

In this article, we describe highly cited publications in a PLOS ONE full-text corpus. For these publications, we analyse the citation contexts concerning their position in the text and their age at the time of citing. By selecting the perspective of highly cited papers, we can distinguish them based on the context during citation even if we do not have any other information source or metrics. We describe the top references based on how, when and in which context they are cited. The focus of this study is to explain the nature of the reception of highly cited papers. We found that these references are distinguishable by IMRaD sections...

10.48550/arxiv.1903.11693 preprint EN other-oa arXiv (Cornell University) 2019-01-01

The OpenCitations Data Model

OPENALEX - Publications

Marilena Daquino, Silvio Peroni, David M. Shotton, Giovanni Colavizza, Behnam Ghavimi, and 4 more

A variety of schemas and ontologies are currently used for the machine-readable description of bibliographic entities and citations. This diversity, and the reuse of the same ontology terms with different nuances, generates inconsistencies in data. Adoption of a single data model would facilitate data integration tasks regardless of the data supplier or context of application. In this paper we present the OpenCitations Data Model (OCDM), a generic data model for describing bibliographic entities and citations, developed using Semantic Web technologies. We also evaluate the effective...

10.48550/arxiv.2005.11981 preprint EN other-oa arXiv (Cornell University) 2020-01-01