Zeyd Boukhers

ORCID: 0000-0001-9778-9164
Research Areas
  • Natural Language Processing
  • Data Quality Assessment and Improvement
  • Biomedical Ontologies and Text Mining
  • Visual Object Tracking and Person Re-identification
  • Statistical Machine Translation and Natural Language Processing
  • Automatic Keyword Extraction from Textual Data
  • Image Feature Retrieval and Recognition Techniques
  • The Spread of Misinformation Online
  • Blockchain and Internet of Things Integration
  • Privacy-Preserving Techniques for Data Analysis and Machine Learning
  • Stereo Vision and Depth Estimation
  • Empirical Studies in Software Engineering
  • Mathematical Information Retrieval and Search
  • Deep Learning in Computer Vision and Image Recognition
  • Data Sharing and Stewardship in Science
  • Human Action Recognition and Pose Estimation
  • Automated Reconstruction of Fragmented Objects
  • Management and Reproducibility of Scientific Workflows
  • Visual Question Answering in Images and Videos
  • Semantic Web and Ontology Development
  • Automated Detection of Hate Speech and Offensive Language
  • Competition in Two-Sided Markets
  • Forensic Anthropological Research
  • Model-Based Clustering with Mixture Models
  • Archaeological Remote Sensing using Remote Sensing Techniques

Fraunhofer Institute for Applied Information Technology
2022-2024

University Hospital Cologne
2022-2023

University of Cologne
2023

University of Koblenz and Landau
2019-2023

University of Stuttgart
2021

University of Siegen
2015-2017

Edge detection is a fundamental technique in various computer vision tasks. Edges are indeed effectively delineated by pixel discontinuity and can offer reliable structural information even in textureless areas. State-of-the-art heavily relies on pixel-wise annotations, which are labor-intensive and subject to inconsistencies when acquired manually. In this work, we propose a novel self-supervised approach for edge detection that employs a multi-level, multi-homography technique to transfer annotations from synthetic to real-world...

10.48550/arxiv.2401.02313 preprint EN other-oa arXiv (Cornell University) 2024-01-01

Our vision paper outlines a plan to improve the future of semantic interoperability in data spaces through the application of machine learning. The use of data spaces, where data is exchanged among members in a self-regulated environment, is becoming increasingly popular. However, the current manual practices of managing metadata and vocabularies in these spaces are time-consuming, prone to errors, and may not meet the needs of all stakeholders. By leveraging the power of machine learning, we believe that semantic interoperability in data spaces can be significantly improved. This involves automatically...

10.1145/3543873.3587658 article EN 2023-04-28

Vojta-therapy is a useful technique for the treatment of physical and mental impairments in humans, and is very effective for children less than 6 months. During the therapy, a specific stimulation is given to the patient's body to perform certain reflexive pattern movements. The repetition of this stimulation ultimately makes the previously blocked connections between the spinal cord and brain available, and after a few sessions, patients can perform these movements without any external stimulation. The therapy must be performed several times a day or week and last for weeks...

10.1109/icip.2016.7532555 article EN 2016 IEEE International Conference on Image Processing (ICIP) 2016-08-17

This demo paper presents a generic toolchain to extract, segment and match literature references from full text PDF files in the project EXCITE. The aim of EXCITE is extracting and matching citations from social science publications and making more citation data available to researchers. Each single step of the pipeline and the open source tools used to accomplish the tasks are explained. A public system which integrates all components under a user-friendly interface is put forward and illustrated. As a final step, a special component...

10.1109/jcdl.2019.00105 article EN 2019-06-01

In the academic world, the number of scientists grows every year and so does the number of authors sharing the same names. Consequently, it is challenging to assign newly published papers to their respective authors. Therefore, author name ambiguity is considered a critical open problem in digital libraries. This paper proposes an author name disambiguation approach that links author names to real-world entities by leveraging their co-authors and domain of research. To this end, we use data collected from the DBLP repository that contains more than 5...

10.1007/s00799-023-00361-6 article EN cc-by International Journal on Digital Libraries 2023-05-04

Topic modeling is a popular technique for clustering large collections of text documents. A variety of different types of regularization is implemented in topic modeling. In this paper, we propose a novel approach for analyzing the influence of different regularization types on the results of topic modeling. Based on Renyi entropy, this approach is inspired by concepts from statistical physics, where an inferred topical structure of a collection can be considered an information system residing in a non-equilibrium state. By testing our approach on four models: Probabilistic Latent Semantic Analysis (pLSA),...

10.3390/e22040394 article EN cc-by Entropy 2020-03-30

This paper addresses the problem of extracting and segmenting references from PDF documents. The novelty of the presented approach lies in its capability to discover highly varying references, mainly in terms of content, length and location in the document. Unlike existing works, the proposed method does not follow the classical pipeline that consists of sequential phases. It rather learns the different characteristics of references to be used in a coherent scheme that reduces error accumulation by following a probabilistic approach. Contrary to conventional references,...

10.1109/jcdl.2019.00035 article EN 2019-06-01

Due to the significant advancement of Natural Language Processing and Computer Vision-based models, Visual Question Answering (VQA) systems are becoming more intelligent and advanced. However, they are still error-prone when dealing with relatively complex questions. Therefore, it is important to understand the behaviour of the VQA models before adopting their results. In this paper, we introduce an interpretability approach for VQA models by generating counterfactual images. Specifically, the generated image is supposed to have...

10.3390/s22062245 article EN cc-by Sensors 2022-03-14

Since the birth of Bitcoin in 2009, cryptocurrencies have emerged to become a global phenomenon and an important decentralized financial asset. Due to this decentralization, the value of these digital currencies against fiat currencies is highly volatile over time. Therefore, forecasting the crypto-fiat currency exchange rate is an extremely challenging task. For reliable forecasting, this paper proposes a multimodal AdaBoost-LSTM ensemble approach that employs all modalities which derive price fluctuation such as social media...

10.48550/arxiv.2202.08967 preprint EN cc-by-nc-nd arXiv (Cornell University) 2022-01-01

Traditional data monetization approaches face challenges related to data protection and logistics. In response, digital data marketplaces have emerged as intermediaries simplifying data transactions. Despite the growing establishment and acceptance of digital data marketplaces, significant challenges hinder efficient data trading. As a result, few companies can derive tangible value from their data, leading to missed opportunities in understanding customers, pricing decisions, and fraud prevention. In this paper, we explore both technical...

10.48550/arxiv.2401.09199 preprint EN cc-by arXiv (Cornell University) 2024-01-01

The concept of FAIR Digital Objects (FDOs) aims to revolutionise the field of digital preservation and accessibility in the next few years. Central to this revolution is the alignment of FDOs with the FAIR (Findable, Accessible, Interoperable, Reusable) Principles, particularly emphasizing machine-actionability and interoperability across diverse data ecosystems. This abstract introduces the "FDO Manager", a Minimum Viable Implementation, designed to optimize the management of FDOs following these principles and the FDO specifications. The FDO Manager...

10.48550/arxiv.2402.03812 preprint EN arXiv (Cornell University) 2024-02-06

This paper aims to tackle the challenge posed by the increasing integration of software tools in research across various disciplines by investigating the application of Falcon-7b for the detection and classification of software mentions within scholarly texts. Specifically, the study focuses on solving Subtask I of the Software Mention Detection in Scholarly Publications (SOMD), which entails identifying and categorizing software mentions from academic literature. Through comprehensive experimentation, the paper explores different training strategies, including a...

10.48550/arxiv.2405.08514 preprint EN arXiv (Cornell University) 2024-05-14

Yawning detection is actively used in multimedia applications such as driver fatigue assessment and status monitoring. However, the accuracy and robustness of existing yawning detectors are limited due to variations in environments (especially lights), facial expressions, and confusion behaviours (e.g., talking and eating). This paper introduces a transformer-based method, YawnNet, for accurate yawning detection by leveraging spatial-temporal encoding and local cues. In particular, YawnNet contains a data processing stage with...

10.1145/3652583.3657618 article EN 2024-05-30

Addressing the complexity of accurately classifying International Classification of Diseases (ICD) codes from medical discharge summaries is challenging due to the intricate nature of medical documentation. This paper explores the use of Large Language Models (LLM), specifically the LLAMA architecture, to enhance ICD code classification through two methodologies: direct application as a classifier and as a generator of enriched text representations within a Multi-Filter Residual Convolutional Neural Network (MultiResCNN) framework....

10.48550/arxiv.2411.06823 preprint EN arXiv (Cornell University) 2024-11-11

In this paper we apply multifractal formalism to the analysis of the statistical behaviour of topic models under the condition of a varying number of topics. Our analysis reveals the existence of two self-similar regions and one transition region in the function of density-of-states depending on the number of topics. As earlier a function that can be expressed through the density-of-states function was successfully used to determine the optimal number of topics, we test the applicability of the density-of-states function for the same purpose. We provide numerical results for three topic models (PLSA, ARTM, and LDA Gibbs sampling) on marked-up collections containing texts...

10.1088/1742-6596/1163/1/012025 article EN Journal of Physics Conference Series 2019-02-01

Image research has shown substantial attention in deblurring networks in recent years. Yet, their practical usage in real-world deblurring, especially motion blur, remains limited due to the lack of pixel-aligned training triplets (background, blurred image, and blur heat map) and the restricted information inherent in blurred images. This paper presents a simple yet efficient framework to synthesize and restore motion-blurred images using Inertial Measurement Unit (IMU) data. Notably, the framework includes a strategy for training triplet generation,...

10.48550/arxiv.2402.06854 preprint EN arXiv (Cornell University) 2024-02-09

Background: Pneumonia and lung cancer have a mutually reinforcing relationship. Lung cancer patients are prone to contracting COVID-19, with poorer prognoses. Additionally, COVID-19 infection can impact anticancer treatments for lung cancer patients. Developing an early diagnostic system for COVID-19 pneumonia can help improve the prognosis of lung cancer patients with COVID-19 infection. Method: This study proposes a neural network for COVID-19 diagnosis based on non-enhanced CT scans, consisting of two 3D convolutional neural networks (CNN) connected in series to form modules. The first...

10.3389/fmed.2024.1444708 article EN cc-by Frontiers in Medicine 2024-08-12

For semantic analysis of activities and events in videos, it is important to capture the spatio-temporal relation among objects in the 3D space. In this paper, we present a probabilistic method that extracts 3D trajectories of objects from 2D videos captured from a monocular moving camera. Compared with existing methods that rely on restrictive assumptions, we propose a method that can extract 3D trajectories with much less restriction by adopting new example-based techniques, which compensate for the lack of information. Here, we estimate the focal length of the camera based on similar...

10.1109/tcsvt.2017.2727963 article EN IEEE Transactions on Circuits and Systems for Video Technology 2017-07-17

In contrast to most of the English scientific publications that follow standard and simple layouts, the order, content, position and size of metadata in German publications vary greatly among publications. This variety makes traditional NLP methods fail to accurately extract metadata from these publications. In this paper, we present a method that extracts metadata from PDF documents with different layouts and styles by viewing the document as an image. We used Mask R-CNN which is trained on the COCO dataset and finetuned with the PubLayNet dataset that consists of 200K PDF snapshots with five basic classes...

10.1109/jcdl52503.2021.00076 preprint EN 2021-09-01