Zeyd Boukhers

ORCID: 0000-0001-9778-9164
Research Areas
  • Data Quality and Management
  • Topic Modeling
  • Biomedical Text Mining and Ontologies
  • Video Surveillance and Tracking Methods
  • Misinformation and Its Impacts
  • Advanced Image and Video Retrieval Techniques
  • Natural Language Processing Techniques
  • Advanced Text Analysis Techniques
  • Semantic Web and Ontologies
  • Mathematics, Computing, and Information Processing
  • Blockchain Technology Applications and Security
  • Software Engineering Research
  • Advanced Vision and Imaging
  • Privacy-Preserving Technologies in Data
  • Research Data Management Practices
  • Scientific Computing and Data Management
  • Advanced Neural Network Applications
  • AI in cancer detection
  • Human Pose and Action Recognition
  • Image Processing and 3D Reconstruction
  • Hate Speech and Cyberbullying Detection
  • Multimodal Machine Learning Applications
  • Machine Learning in Healthcare
  • Robotics and Sensor-Based Localization
  • Medical Coding and Health Information

Fraunhofer Institute for Applied Information Technology
2022-2024

University Hospital Cologne
2022-2023

University of Cologne
2023

University of Koblenz and Landau
2019-2023

Universität Koblenz
2019-2023

University of Stuttgart
2021

University of Siegen
2015-2017

The availability of metadata for scientific documents is pivotal in propelling knowledge forward and adhering to the FAIR principles (i.e., Findability, Accessibility, Interoperability, and Reusability) of research findings. However, the lack of sufficient metadata for published documents, particularly those from smaller and mid-sized publishers, hinders their accessibility. This issue is widespread in some disciplines, such as German Social Sciences, where publications often employ diverse templates. To address this challenge,...

10.48550/arxiv.2501.05082 preprint EN arXiv (Cornell University) 2025-01-09

Europe's healthcare systems require enhanced interoperability and digitalization, driving a demand for innovative solutions to process legacy clinical data. This paper presents the results of our project, which aims to leverage Large Language Models (LLMs) to extract structured information from unstructured clinical reports, focusing on patient history, diagnoses, treatments, and other predefined categories. We developed a workflow with a user interface and evaluated LLMs of varying sizes through prompting strategies...

10.48550/arxiv.2502.05638 preprint EN arXiv (Cornell University) 2025-02-08

In the digital age, data has emerged as one of the most valuable assets across various sectors, including academia, industry, and healthcare. Effective data preservation involves data management to ensure its long-term accessibility and usability. Given the importance and sensitivity of data, the need for effective data preservation is crucial. One of the major recently proposed approaches is FAIR Digital Objects (FDOs), which revolutionize the field of data preservation. Central to this revolution is the alignment of FDOs with the FAIR principles (Findable, Accessible,...

10.52825/ocp.v5i.1421 article EN cc-by Open Conference Proceedings 2025-03-18

Edge detection is a fundamental technique in various computer vision tasks. Edges are effectively delineated by pixel discontinuity and can offer reliable structural information even in textureless areas. The state of the art relies heavily on pixel-wise annotations, which are labor-intensive and subject to inconsistencies when acquired manually. In this work, we propose a novel self-supervised approach for edge detection that employs a multi-level, multi-homography technique to transfer annotations from synthetic to real-world...

10.48550/arxiv.2401.02313 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Vojta-therapy is a useful technique for the treatment of physical and mental impairments in humans, and is very effective for children younger than 6 months. During therapy, a specific stimulation is given to the patient's body to perform certain reflexive pattern movements. The repetition of this stimulation ultimately makes previously blocked connections between the spinal cord and the brain available, and after a few sessions, patients can perform these movements without any external stimulation. The therapy must be performed several times a day or week and last for weeks...

10.1109/icip.2016.7532555 article EN 2016 IEEE International Conference on Image Processing (ICIP) 2016-08-17

This demo paper presents a generic toolchain to extract, segment and match literature references from full text PDF files in the project EXCITE. The aim of EXCITE is extracting and matching citations from social science publications and making more citation data available to researchers. Each single step of the pipeline and the open source tools used to accomplish the tasks are explained. The public system, which integrates all components under a user-friendly interface, is put forward and illustrated. As a final step, a special component...

10.1109/jcdl.2019.00105 article EN 2019-06-01

Our vision paper outlines a plan to improve the future of semantic interoperability in data spaces through the application of machine learning. The use of data spaces, where data is exchanged among members in a self-regulated environment, is becoming increasingly popular. However, the current manual practices of managing metadata and vocabularies in these data spaces are time-consuming, prone to errors, and may not meet the needs of all stakeholders. By leveraging the power of machine learning, we believe that semantic interoperability in data spaces can be significantly improved. This involves automatically...

10.1145/3543873.3587658 article EN 2023-04-28

Topic modeling is a popular technique for clustering large collections of text documents. A variety of different types of regularization is implemented in topic modeling. In this paper, we propose a novel approach for analyzing the influence of different regularization types on the results of topic modeling. Based on Renyi entropy, this approach is inspired by concepts from statistical physics, where an inferred topical structure of a collection can be considered an information statistical system residing in a non-equilibrium state. By testing our approach on four models: Probabilistic Latent Semantic Analysis (pLSA),...

10.3390/e22040394 article EN cc-by Entropy 2020-03-30

This paper addresses the problem of extracting and segmenting references from PDF documents. The novelty of the presented approach lies in its capability to discover highly varying references, mainly in terms of content, length and location in the document. Unlike existing works, the proposed method does not follow the classical pipeline that consists of sequential phases. It rather learns the different characteristics of references to be used in a coherent scheme that reduces error accumulation by following a probabilistic approach. Contrary to conventional references,...

10.1109/jcdl.2019.00035 article EN 2019-06-01

Due to the significant advancement of Natural Language Processing and Computer Vision-based models, Visual Question Answering (VQA) systems are becoming more intelligent and advanced. However, they are still error-prone when dealing with relatively complex questions. Therefore, it is important to understand the behaviour of VQA models before adopting their results. In this paper, we introduce an interpretability approach for VQA models by generating counterfactual images. Specifically, the generated image is supposed to have...

10.3390/s22062245 article EN cc-by Sensors 2022-03-14

In the academic world, the number of scientists grows every year and so does the number of authors sharing the same names. Consequently, it is challenging to assign newly published papers to their respective authors. Therefore, author name ambiguity is considered a critical open problem in digital libraries. This paper proposes an author name disambiguation approach that links author names to their real-world entities by leveraging their co-authors and domain of research. To this end, we use data collected from the DBLP repository that contains more than 5...

10.1007/s00799-023-00361-6 article EN cc-by International Journal on Digital Libraries 2023-05-04

In this paper, we apply multifractal formalism to the analysis of the statistical behaviour of topic models under the condition of a varying number of topics. Our analysis reveals the existence of two self-similar regions and one transition region in the function of density-of-states depending on the number of topics. Since a function that can be expressed through the density-of-states function was previously used successfully to determine the optimal number of topics, we test the applicability of the multifractal formalism for the same purpose. We provide numerical results for three topic models (PLSA, ARTM, and LDA Gibbs sampling) on marked-up collections containing texts...

10.1088/1742-6596/1163/1/012025 article EN Journal of Physics Conference Series 2019-02-01

For semantic analysis of activities and events in videos, it is important to capture the spatio-temporal relation among objects in the 3D space. In this paper, we present a probabilistic method that extracts 3D trajectories of objects from 2D videos captured from a monocular moving camera. Compared with existing methods that rely on restrictive assumptions, the method we propose can extract 3D trajectories with much less restriction by adopting new example-based techniques, which compensate for the lack of information. Here, we estimate the focal length of the camera based on similar...

10.1109/tcsvt.2017.2727963 article EN IEEE Transactions on Circuits and Systems for Video Technology 2017-07-17

Since the birth of Bitcoin in 2009, cryptocurrencies have emerged to become a global phenomenon and an important decentralized financial asset. Due to this decentralization, the value of these digital currencies against fiat currencies is highly volatile over time. Therefore, forecasting the crypto-fiat currency exchange rate is an extremely challenging task. For reliable forecasting, this paper proposes a multimodal AdaBoost-LSTM ensemble approach that employs all modalities which derive price fluctuation, such as social media...

10.48550/arxiv.2202.08967 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01

Skeleton Ground Truth (GT) is critical to the success of supervised skeleton extraction methods, especially with the popularity of deep learning techniques. Furthermore, we see skeleton GTs used not only for training skeleton detectors with Convolutional Neural Networks (CNN), but also for evaluating skeleton-related pruning and matching algorithms. However, most existing shape and image datasets suffer from the lack of skeleton GT and the inconsistency of GT standards. As a result, it is difficult to evaluate and reproduce CNN-based algorithms on a fair basis....

10.1007/s11263-023-01926-3 article EN cc-by International Journal of Computer Vision 2023-11-01

In contrast to most English scientific publications, which follow standard and simple layouts, the order, content, position and size of metadata in German publications vary greatly among publications. This variety makes traditional NLP methods fail to accurately extract metadata from these publications. In this paper, we present a method that extracts metadata from PDF documents with different layouts and styles by viewing the document as an image. We used Mask R-CNN, which is trained on the COCO dataset and finetuned on the PubLayNet dataset, which consists of 200K snapshots with five basic classes...

10.1109/jcdl52503.2021.00076 preprint EN 2021-09-01

To detect an event which is defined by the interaction of objects in a video, it is necessary to capture their spatio-temporal relation. However, a video only displays the original 3D space projected onto a 2D image plane. This paper introduces a method which extracts 3D trajectories of objects from 2D videos. Each trajectory represents the transition of an object's positions in the 3D space. We extract such trajectories by combining object detection with depth estimation that estimates the depth information in 2D videos. The major problem for this is the inconsistency between object detection and depth estimation results....

10.1109/cbmi.2015.7153632 article EN 2015-06-01

Due to the significant advancement of Natural Language Processing and Computer Vision-based models, Visual Question Answering (VQA) systems are becoming more intelligent and advanced. However, they are still error-prone when dealing with relatively complex questions. Therefore, it is important to understand the behaviour of VQA models before adopting their results. In this paper, we introduce an interpretability approach for VQA models by generating counterfactual images. Specifically, the generated image is supposed to have...

10.48550/arxiv.2201.03342 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01

The challenge of automatically extracting metadata from scientific PDF documents varies depending on the diversity of layouts within the PDF collection. In some disciplines such as German social sciences, authors are not required to generate their papers according to a specific template and they often create their own templates, which yield a high appearance diversity across publications. Overcoming this diversity using only Natural Language Processing (NLP) approaches is not always effective, which is reflected in the metadata unavailability of a large portion...

10.1145/3529372.3533295 article EN 2022-06-06