NFDI4DS | UHH-SEMS - Publication Details

Zeyd Boukhers

ORCID: 0000-0001-9778-9164

Contact & Profiles

A5052763223

Research Areas

Data Quality and Management
Topic Modeling
Biomedical Text Mining and Ontologies
Video Surveillance and Tracking Methods
Misinformation and Its Impacts
Advanced Image and Video Retrieval Techniques
Natural Language Processing Techniques
Advanced Text Analysis Techniques
Semantic Web and Ontologies
Mathematics, Computing, and Information Processing
Blockchain Technology Applications and Security
Software Engineering Research
Advanced Vision and Imaging
Privacy-Preserving Technologies in Data
Research Data Management Practices
Scientific Computing and Data Management
Advanced Neural Network Applications
AI in cancer detection
Human Pose and Action Recognition
Image Processing and 3D Reconstruction
Hate Speech and Cyberbullying Detection
Multimodal Machine Learning Applications
Machine Learning in Healthcare
Robotics and Sensor-Based Localization
Medical Coding and Health Information

Fraunhofer Institute for Applied Information Technology
2022-2024

University Hospital Cologne
2022-2023

University of Cologne
2023

University of Koblenz and Landau
2019-2023

Universität Koblenz
2019-2023

University of Stuttgart
2021

University of Siegen
2015-2017

Comparison of Feature Learning Methods for Metadata Extraction from PDF Scholarly Documents

OPENALEX - Publications

Zeyd Boukhers Cong Yang

The availability of metadata for scientific documents is pivotal in propelling knowledge forward and for adhering to the FAIR principles (i.e. Findability, Accessibility, Interoperability, and Reusability) of research findings. However, the lack of sufficient metadata for published documents, particularly those from smaller and mid-sized publishers, hinders their accessibility. This issue is widespread in some disciplines, such as the German Social Sciences, where publications often employ diverse templates. To address this challenge,...

10.48550/arxiv.2501.05082 preprint EN arXiv (Cornell University) 2025-01-09

ELMTEX: Fine-Tuning Large Language Models for Structured Clinical Information Extraction. A Case Study on Clinical Reports

Aynur Guluzade Naguib Heiba Zeyd Boukhers Florim Hamiti Jahid Hasan Polash and 2 more

Europe's healthcare systems require enhanced interoperability and digitalization, driving a demand for innovative solutions to process legacy clinical data. This paper presents the results of our project, which aims to leverage Large Language Models (LLMs) to extract structured information from unstructured reports, focusing on patient history, diagnoses, treatments, and other predefined categories. We developed a workflow with a user interface and evaluated LLMs of varying sizes through prompting strategies...

10.48550/arxiv.2502.05638 preprint EN arXiv (Cornell University) 2025-02-08

FDO Manager: Minimum Viable FAIR Digital Object Implementation

Oussama Zoubia Nagaraj Bahubali Asundi Adamantios Koumpis Christoph Lange Sezin Dogan and 2 more

In the digital age, data has emerged as one of the most valuable assets across various sectors, including academia, industry, and healthcare. Effective data preservation involves data management to ensure its long-term accessibility and usability. Given the importance and sensitivity of data, the need for effective data preservation is a crucial necessity. One of the big recent proposed approaches is FAIR Digital Objects (FDOs), which revolutionize the field of data preservation. Central to this revolution is the alignment of FDOs with the FAIR principles (Findable, Accessible,...

10.52825/ocp.v5i.1421 article EN cc-by Open Conference Proceedings 2025-03-18

BladeView: Toward Automatic Wind Turbine Inspection With Unmanned Aerial Vehicle

Cong Yang Zhou Hua Xun Liu Yan Ke Bo Gao and 4 more

10.1109/tase.2024.3464640 article EN cc-by IEEE Transactions on Automation Science and Engineering 2024-01-01

SuperEdge: Towards a Generalization Model for Self-Supervised Edge Detection

Leng Kai Zhijie Zhang Jie Liu Zeyd Boukhers Wei Sui and 2 more

Edge detection is a fundamental technique in various computer vision tasks. Edges are indeed effectively delineated by pixel discontinuity and can offer reliable structural information even in textureless areas. State-of-the-art edge detection heavily relies on pixel-wise annotations, which are labor-intensive and subject to inconsistencies when acquired manually. In this work, we propose a novel self-supervised approach for edge detection that employs a multi-level, multi-homography technique to transfer annotations from synthetic to real-world...

10.48550/arxiv.2401.02313 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Automatic recognition of movement patterns in the vojta-therapy using RGB-D data

Muhammad Hassan Khan Jullien Helsper Zeyd Boukhers Marcin Grzegorzek

Vojta-therapy is a useful technique for the treatment of physical and mental impairments in humans, and is very effective for children less than 6 months. During the therapy, a specific stimulation is given to the patient's body to perform certain reflexive pattern movements. The repetition of this stimulation ultimately makes the previously blocked connections between the spinal cord and the brain available, and after a few sessions, patients can perform these movements without any external stimulation. The therapy must be performed several times a day or week and last for weeks...

10.1109/icip.2016.7532555 article EN 2016 IEEE International Conference on Image Processing (ICIP) 2016-08-17

EXCITE – A Toolchain to Extract, Match and Publish Open Literature References

Azam Hosseini Behnam Ghavimi Zeyd Boukhers Philipp Mayr

This demo paper presents a generic toolchain to extract, segment and match literature references from full text PDF files in the project EXCITE. The aim of EXCITE is extracting and matching citations from social science publications and making more citation data available to researchers. Each single step of the pipeline and the open source tools used to accomplish the tasks are explained. The public system which integrates all components under a user-friendly interface is put forward and illustrated. As a final step, a special component...

10.1109/jcdl.2019.00105 article EN 2019-06-01

Enhancing Data Space Semantic Interoperability through Machine Learning: a Visionary Perspective

Zeyd Boukhers Christoph Lange Oya Beyan

Our vision paper outlines a plan to improve the future of semantic interoperability in data spaces through the application of machine learning. The use of data spaces, where data is exchanged among members in a self-regulated environment, is becoming increasingly popular. However, the current manual practices of managing metadata and vocabularies in these spaces are time-consuming, prone to errors, and may not meet the needs of all stakeholders. By leveraging the power of machine learning, we believe that semantic interoperability in data spaces can be significantly improved. This involves automatically...

10.1145/3543873.3587658 article EN 2023-04-28

Analyzing the Influence of Hyper-parameters and Regularizers of Topic Modeling in Terms of Renyi Entropy

Sergei Koltcov Vera Ignatenko Zeyd Boukhers Steffen Staab

Topic modeling is a popular technique for clustering large collections of text documents. A variety of different types of regularization is implemented in topic modeling. In this paper, we propose a novel approach for analyzing the influence of different regularization types on the results of topic modeling. Based on Renyi entropy, this approach is inspired by the concepts from statistical physics, where an inferred topical structure of a collection can be considered an information system residing in a non-equilibrium state. By testing our approach on four models: Probabilistic Latent Semantic Analysis (pLSA),...

10.3390/e22040394 article EN cc-by Entropy 2020-03-30

An End-to-End Approach for Extracting and Segmenting High-Variance References from PDF Documents

Zeyd Boukhers Shriharsh Ambhore Steffen Staab

This paper addresses the problem of extracting and segmenting references from PDF documents. The novelty of the presented approach lies in its capability to discover highly varying references, mainly in terms of content, length and location in the document. Unlike existing works, the proposed method does not follow the classical pipeline that consists of sequential phases. It rather learns the different characteristics of references to be used in a coherent scheme that reduces the error accumulation by following a probabilistic approach. Contrary to conventional references,...

10.1109/jcdl.2019.00035 article EN 2019-06-01

COIN: Counterfactual Image Generation for Visual Question Answering Interpretation

Zeyd Boukhers T. Hartmann Jan Jürjens

Due to the significant advancement of Natural Language Processing and Computer Vision-based models, Visual Question Answering (VQA) systems are becoming more intelligent and advanced. However, they are still error-prone when dealing with relatively complex questions. Therefore, it is important to understand the behaviour of the VQA models before adopting their results. In this paper, we introduce an interpretability approach for VQA models by generating counterfactual images. Specifically, the generated image is supposed to have...

10.3390/s22062245 article EN cc-by Sensors 2022-03-14

Deep author name disambiguation using DBLP data

Zeyd Boukhers Nagaraj Bahubali Asundi

In the academic world, the number of scientists grows every year and so does the number of authors sharing the same names. Consequently, it is challenging to assign newly published papers to their respective authors. Therefore, author name ambiguity is considered a critical open problem in digital libraries. This paper proposes an author name disambiguation approach that links author names to their real-world entities by leveraging their co-authors and domain of research. To this end, we use data collected from the DBLP repository that contains more than 5...

10.1007/s00799-023-00361-6 article EN cc-by International Journal on Digital Libraries 2023-05-04

Fractal approach for determining the optimal number of topics in the field of topic modeling.

Vera Ignatenko Sergei Koltcov Steffen Staab Zeyd Boukhers

In this paper we apply multifractal formalism to the analysis of statistical behaviour topic models under condition varying number topics. Our reveals existence two self-similar regions and one transition region in function density-of-states depending on As earlier a that can be expressed through was successfully used determine optimal topics, test applicability for same purpose. We provide numerical results three (PLSA, ARTM, LDA Gibbs sampling) marked-up collections containing texts...

10.1088/1742-6596/1163/1/012025 article EN Journal of Physics Conference Series 2019-02-01

Example-Based 3D Trajectory Extraction of Objects From 2D Videos

Zeyd Boukhers Kimiaki Shirahama Marcin Grzegorzek

For semantic analysis of activities and events in videos, it is important to capture the spatio-temporal relation among objects in the 3D space. In this paper, we present a probabilistic method that extracts 3D trajectories of objects from 2D videos captured from a monocular moving camera. Compared with existing methods that rely on restrictive assumptions, we propose a method that can extract 3D trajectories with much less restriction by adopting new example-based techniques, which compensate for the lack of information. Here, we estimate the focal length of the camera based on similar...

10.1109/tcsvt.2017.2727963 article EN IEEE Transactions on Circuits and Systems for Video Technology 2017-07-17

Ensemble and Multimodal Approach for Forecasting Cryptocurrency Price

Zeyd Boukhers Azeddine Bouabdallah Matthias Löhr Jan Jürjens

Since the birth of Bitcoin in 2009, cryptocurrencies have emerged to become a global phenomenon and an important decentralized financial asset. Due to this decentralization, the value of these digital currencies against fiat currencies is highly volatile over time. Therefore, forecasting the crypto-fiat currency exchange rate is an extremely challenging task. For reliable forecasting, this paper proposes a multimodal AdaBoost-LSTM ensemble approach that employs all modalities which derive price fluctuation such as social media...

10.48550/arxiv.2202.08967 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01

Skeleton Ground Truth Extraction: Methodology, Annotation Tool and Benchmarks

Cong Yang Bipin Indurkhya John See Bo Gao Yan Ke and 3 more

Skeleton Ground Truth (GT) is critical to the success of supervised skeleton extraction methods, especially with the popularity of deep learning techniques. Furthermore, we see skeleton GTs used not only for training skeleton detectors with Convolutional Neural Networks (CNN), but also for evaluating skeleton-related pruning and matching algorithms. However, most existing shape and image datasets suffer from the lack of skeleton GT and inconsistency of GT standards. As a result, it is difficult to evaluate and reproduce CNN-based skeleton detectors and algorithms on a fair basis....

10.1007/s11263-023-01926-3 article EN cc-by International Journal of Computer Vision 2023-11-01

MexPub: Deep Transfer Learning for Metadata Extraction from German Publications

Zeyd Boukhers Nada Beili T. Hartmann Prantik Goswami Muhammad Arslan Zafar

In contrast to most of the English scientific publications that follow standard and simple layouts, the order, content, position and size of metadata in German publications vary greatly among publications. This variety makes traditional NLP methods fail to accurately extract metadata from these publications. In this paper, we present a method that extracts metadata from PDF documents with different layouts and styles by viewing the document as an image. We used Mask R-CNN that is trained on the COCO dataset and finetuned with the PubLayNet dataset that consists of 200K PDF snapshots with five basic classes...

10.1109/jcdl52503.2021.00076 preprint EN 2021-09-01

Object detection and depth estimation for 3D trajectory extraction

Zeyd Boukhers Kimiaki Shirahama Frédéric Li Marcin Grzegorzek

To detect an event which is defined by the interaction of objects in a video, it is necessary to capture their spatio-temporal relation. However, the video only displays the original 3D space which is projected onto a 2D image plane. This paper introduces a method which extracts 3D trajectories of objects from videos. Each trajectory represents the transition of an object's positions in the 3D space. We extract such trajectories by combining object detection with depth estimation that estimates the depth information in 2D videos. The major problem for this is the inconsistency between object detection and depth estimation results....

10.1109/cbmi.2015.7153632 article EN 2015-06-01

COIN: Counterfactual Image Generation for VQA Interpretation

Zeyd Boukhers T. Hartmann Jan Jürjens

10.48550/arxiv.2201.03342 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01

Vision and natural language for metadata extraction from scientific PDF documents

Zeyd Boukhers Azeddine Bouabdallah

The challenge of automatically extracting metadata from scientific PDF documents varies depending on the diversity of layouts within the PDF collection. In some disciplines such as German social sciences, the authors are not required to generate their papers according to a specific template and they often create their own templates which yield a high appearance diversity across publications. Overcoming this diversity using only Natural Language Processing (NLP) approaches is not always effective which is reflected in the unavailability of the metadata of a large portion...

10.1145/3529372.3533295 article EN 2022-06-06
