Behnam Ghavimi

ORCID: 0000-0002-4627-5371
Research Areas
  • Biomedical Text Mining and Ontologies
  • Data Quality and Management
  • Topic Modeling
  • Semantic Web and Ontologies
  • Advanced Text Analysis Techniques
  • Natural Language Processing Techniques
  • Web Data Mining and Analysis
  • Research Data Management Practices
  • Algorithms and Data Compression
  • Computational Physics and Python Applications
  • Computational and Text Analysis Methods
  • Scientific Computing and Data Management
  • Time Series Analysis and Forecasting

GESIS - Leibniz Institute for the Social Sciences
2016-2020

Leibniz Association
2016

University of Bonn
2016

This demo paper presents a generic toolchain to extract, segment and match literature references from full-text PDF files in the project EXCITE. The aim of EXCITE is extracting and matching citations from social science publications and making more citation data available to researchers. Each single step of the pipeline and the open source tools used to accomplish the tasks are explained. A public system which integrates all components under a user-friendly interface is put forward and illustrated. As a final step, a special component...

10.1109/jcdl.2019.00105 article EN 2019-06-01

Scientific full-text papers are usually stored in separate places from their underlying research datasets. Authors typically make references to datasets by mentioning them, for example by using their titles and the year of publication. However, in most cases explicit links that would provide readers with direct access to the referenced datasets are missing. Manually detecting references to datasets in papers is time-consuming and requires an expert in the domain of the paper. In order to make explicit all links to datasets in papers that have been published already, we suggest and evaluate a semi-automatic approach for finding...

10.5281/zenodo.44608 preprint EN arXiv (Cornell University) 2016-03-06

Today, full-texts of scientific articles are often stored in different locations than the used datasets. Dataset registries aim at a closer integration by making datasets citable, but authors typically refer to datasets using inconsistent abbreviations and heterogeneous metadata (e.g. title, publication year). It is thus hard to reproduce research results, to access datasets for further analysis, and to determine the impact of a dataset. Manually detecting references to datasets in scientific articles is time-consuming and requires expert knowledge in the underlying research domain. We propose...

10.3233/isu-160816 article EN Information Services & Use 2016-12-24

Scientific full-text papers are usually stored in separate places from their underlying research datasets. Authors typically make references to datasets by mentioning them, for example by using their titles and the year of publication. However, in most cases explicit links that would provide readers with direct access to the referenced datasets are missing. Manually detecting references to datasets in papers is time-consuming and requires an expert in the domain of the paper. In order to make explicit all links to datasets in papers that have been published already, we suggest and evaluate a semi-automatic approach for finding...

10.48550/arxiv.1603.01774 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Citation matching is a challenging task due to different problems such as the variety of citation styles, mistakes in reference strings and the quality of identified reference segments. The classic citation matching configuration used in this paper is the combination of a blocking technique and a binary classifier. Three different possible inputs (reference strings, reference segments, and a combination of reference strings and segments) were tested to find the most efficient strategy for citation matching. In the classification step, we describe the effect which the probabilities of reference segments can have in citation matching. Our evaluation on a manually curated gold...

10.48550/arxiv.1906.04484 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Today, full-texts of scientific articles are often stored in different locations than the used datasets. Dataset registries aim at a closer integration by making datasets citable, but authors typically refer to datasets using inconsistent abbreviations and heterogeneous metadata (e.g. title, publication year). It is thus hard to reproduce research results, to access datasets for further analysis, and to determine the impact of a dataset. Manually detecting references to datasets in scientific articles is time-consuming and requires expert knowledge in the underlying research domain. We...

10.48550/arxiv.1611.01820 preprint EN other-oa arXiv (Cornell University) 2016-01-01

In this article, we describe highly cited publications in a PLOS ONE full-text corpus. For these publications, we analyse the citation contexts concerning their position in the text and their age at the time of citing. By selecting the perspective of highly cited papers, we can distinguish them based on the context during citation even if we do not have any other information source or metrics. We describe the top cited references based on how, when and in which context they are cited. The focus of this study is to explain the nature of the reception of highly cited papers. We have found that these references are distinguishable by the IMRaD sections...

10.48550/arxiv.1903.11693 preprint EN other-oa arXiv (Cornell University) 2019-01-01

A variety of schemas and ontologies are currently used for the machine-readable description of bibliographic entities and citations. This diversity, and the reuse of the same ontology terms with different nuances, generates inconsistencies in data. Adoption of a single data model would facilitate data integration tasks regardless of the data supplier or context of application. In this paper we present the OpenCitations Data Model (OCDM), a generic data model for describing bibliographic entities and citations, developed using Semantic Web technologies. We also evaluate the effective...

10.48550/arxiv.2005.11981 preprint EN other-oa arXiv (Cornell University) 2020-01-01