Ahmed K. Elmagarmid

ORCID: 0000-0002-0044-458X
Research Areas
  • Advanced Database Systems and Queries
  • Distributed Systems and Fault Tolerance
  • Data Quality and Management
  • Data Management and Algorithms
  • Data Mining Algorithms and Applications
  • Semantic Web and Ontologies
  • Video Analysis and Summarization
  • Privacy-Preserving Technologies in Data
  • Advanced Data Storage Technologies
  • Distributed and Parallel Computing Systems
  • Service-Oriented Architecture and Web Services
  • Advanced Image and Video Retrieval Techniques
  • Image Retrieval and Classification Techniques
  • Peer-to-Peer Network Technologies
  • Multimedia Communication and Technology
  • Optimization and Search Problems
  • Parallel Computing and Optimization Techniques
  • Time Series Analysis and Forecasting
  • Scientific Computing and Data Management
  • Data Stream Mining Techniques
  • Petri Nets in System Modeling
  • Caching and Content Delivery
  • Cryptography and Data Security
  • Mobile Agent-Based Network Management
  • Cloud Computing and Resource Management

Affiliations

Hamad bin Khalifa University
2016-2023

Qatar Cardiovascular Research Center
2013-2023

Qatar Airways (Qatar)
2012-2019

Purdue University West Lafayette
2005-2018

Qatar Foundation
2010-2017

Institute of Electrical and Electronics Engineers
2007

IBM Research - Thomas J. Watson Research Center
2007

Microsoft Research (United Kingdom)
2006

Hewlett-Packard (United States)
2002-2004

Pennsylvania State University
1985-2003

Publications

Synthesis of multiple randomized controlled trials (RCTs) in a systematic review can summarize the effects of individual outcomes and provide numerical answers about the effectiveness of interventions. Filtering search results is time consuming, and no single method fulfills the principal requirements of speed and accuracy. Automation of systematic reviews is driven by the necessity to expedite the availability of current best evidence for policy and clinical decision-making. We developed Rayyan ( http://rayyan.qcri.org ), a free web and mobile app, that...

10.1186/s13643-016-0384-4 article EN cc-by Systematic Reviews 2016-12-01

Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task. Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats, or any combination of these factors. In this paper, we present a thorough analysis of the literature on duplicate record detection. We cover similarity metrics commonly used to detect similar field entries, and an...

10.1109/tkde.2007.9 article EN IEEE Transactions on Knowledge and Data Engineering 2007-01-01
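
The field-level similarity metrics surveyed here can be illustrated with a plain edit-distance sketch in Python (a generic illustration, not code from the paper; `levenshtein` and `field_similarity` are hypothetical helper names):

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of insertions, deletions,
    and substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def field_similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical fields."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))
```

Two field values are then flagged as potential duplicates when their similarity exceeds a tuned threshold; character-based metrics like this are one of several families the survey covers.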


We propose a new automatic image segmentation method. Color edges in an image are first obtained automatically by combining an improved isotropic edge detector and a fast entropic thresholding technique. After the color edges have provided the major geometric structures in an image, the centroids between these adjacent edge regions are taken as the initial seeds for seeded region growing (SRG). These seeds are then replaced by the centroids of the generated homogeneous regions by incorporating the required additional pixels step by step. Moreover, the results of color-edge extraction and SRG...

10.1109/83.951532 article EN IEEE Transactions on Image Processing 2001-01-01
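
A minimal sketch of seeded region growing on an intensity grid, assuming 4-connectivity and a fixed homogeneity threshold (the paper's method additionally derives seeds from color edges and post-merges regions):

```python
from collections import deque

def seeded_region_growing(image, seeds, threshold=10.0):
    """Grow one region per seed on a 2-D intensity grid: a pixel joins
    a neighboring region when it differs from that region's running
    mean by at most `threshold`. Returns a label grid (-1 = unassigned)."""
    h, w = len(image), len(image[0])
    labels = [[-1] * w for _ in range(h)]
    sums, counts, queue = {}, {}, deque()
    for k, (r, c) in enumerate(seeds):
        labels[r][c] = k
        sums[k], counts[k] = float(image[r][c]), 1
        queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        k = labels[r][c]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and labels[nr][nc] == -1:
                if abs(image[nr][nc] - sums[k] / counts[k]) <= threshold:
                    labels[nr][nc] = k
                    sums[k] += image[nr][nc]
                    counts[k] += 1
                    queue.append((nr, nc))
    return labels
```

Because each region's mean is updated as pixels are absorbed, growth adapts to gradual intensity drift while still stopping at sharp edges.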

Large repositories of data contain sensitive information that must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and for government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks one may encounter when releasing data to outside parties. A key problem, still not sufficiently investigated, is the need to balance the confidentiality of the disclosed data with the legitimate needs...

10.1109/tkde.2004.1269668 article EN IEEE Transactions on Knowledge and Data Engineering 2004-03-08
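
One standard disclosure-limitation technique in this space is k-anonymity-style suppression; a toy sketch (illustrative background only, not the specific mechanism proposed in the paper):

```python
from collections import Counter

def enforce_k_anonymity(rows, quasi_ids, k=2):
    """Suppress (replace with '*') the quasi-identifier values of any
    record whose quasi-identifier combination appears fewer than k
    times, so every released record is indistinguishable from at
    least k-1 others on those attributes."""
    key = lambda row: tuple(row[q] for q in quasi_ids)
    freq = Counter(key(r) for r in rows)
    out = []
    for r in rows:
        r = dict(r)  # copy; never mutate the caller's data
        if freq[key(r)] < k:
            for q in quasi_ids:
                r[q] = "*"
        out.append(r)
    return out
```

The tension the abstract describes is visible even here: larger k lowers disclosure risk but destroys more of the data's analytic value.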

Data products (macrodata or tabular data, and microdata or raw records) are designed to inform public and business policy and research. Securing these products against unauthorized access has been a long-term goal of the database security community and of government statistical agencies. Solutions to this problem require combining several techniques and mechanisms. Recent advances in data mining and machine learning algorithms have, however, increased the disclosure risks one may incur when releasing data to outside parties. Issues...

10.1109/kdex.1999.836532 article EN 2003-01-22

Despite the increasing importance of data quality and the rich theoretical and practical contributions in all aspects of data cleaning, there is no single end-to-end off-the-shelf solution to (semi-)automate the detection and repairing of violations w.r.t. a set of heterogeneous and ad-hoc constraints. In short, there is no commodity platform, similar to general-purpose DBMSs, that can be easily customized and deployed to solve application-specific data quality problems. In this paper, we present NADEEF, an extensible, generalized and easy-to-deploy data cleaning platform....

10.1145/2463676.2465327 article EN 2013-06-22

The importance of sleep is paramount for maintaining physical, emotional and mental wellbeing. Though the relationship between sleep and physical activity is known to be important, it is not yet fully understood. The explosion in popularity of actigraphy and wearable devices provides a unique opportunity to understand this relationship. Leveraging this information source requires new tools to be developed to facilitate data-driven research and patient recommendations. In this paper we explore the use of deep learning to build sleep quality prediction models...

10.2196/mhealth.6562 article EN cc-by JMIR mhealth and uhealth 2016-11-04

Data cleaning is a vital process that ensures the quality of data stored in real-world databases. Data cleaning problems are frequently encountered in many research areas, such as knowledge discovery in databases, data warehousing, system integration and e-services. The problem of identifying record pairs that represent the same entity (duplicate records), commonly known as record linkage, is one of the essential elements of data cleaning. In this paper, we address the record linkage problem by adopting a machine learning approach. Three models are proposed and analyzed empirically....

10.1109/icde.2002.994694 article EN 2003-06-25
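
A machine-learning treatment of record linkage can be sketched as learning a classifier over per-field similarity features of candidate pairs; the perceptron below is a generic stand-in, not one of the three models analyzed in the paper:

```python
def pair_features(r1, r2):
    """Per-field similarity features for a candidate record pair
    (here: exact-match indicators; real systems use string metrics)."""
    return [1.0 if r1[f] == r2[f] else 0.0 for f in sorted(r1)]

def train_perceptron(pairs, labels, epochs=20, lr=0.1):
    """Learn weights separating duplicate (1) from distinct (0) pairs."""
    w = [0.0] * (len(pairs[0]) + 1)  # last slot is the bias term
    for _ in range(epochs):
        for x, y in zip(pairs, labels):
            pred = 1.0 if sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0 else 0.0
            err = y - pred
            for i, xi in enumerate(x):
                w[i] += lr * err * xi
            w[-1] += lr * err
    return w

def is_duplicate(w, x):
    """Classify a feature vector with the learned weights."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1] > 0
```

The key idea carries over to any classifier: labeled duplicate/non-duplicate pairs turn the linkage decision into a supervised learning problem over similarity features.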

The current trend in the application space towards systems of loosely coupled and dynamically bound components that enables just-in-time integration jeopardizes the security of information that is shared between the broker, the requester, and the provider at runtime. In particular, new advances in data mining and knowledge discovery, which allow for the extraction of hidden knowledge from an enormous amount of data, impose new threats on the seamless integration of information. We consider the problem of building privacy-preserving algorithms for one category of data mining techniques, association rule...

10.1109/ride.2002.995109 article EN 2003-06-25
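
The association rules that such privacy-preserving algorithms aim to protect are defined by support and confidence; a minimal miner over pairwise itemsets (a generic sketch, not the paper's algorithm):

```python
from itertools import combinations

def mine_rules(transactions, min_support=0.5, min_conf=0.8):
    """Enumerate association rules A -> B over itemsets of size 2,
    keeping rules with enough support (joint frequency) and
    confidence (conditional frequency)."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    support = lambda s: sum(1 for t in transactions if s <= t) / n
    rules = []
    for a, b in combinations(items, 2):
        for lhs, rhs in (({a}, {b}), ({b}, {a})):
            sup = support(lhs | rhs)
            if sup >= min_support and support(lhs) > 0:
                conf = sup / support(lhs)
                if conf >= min_conf:
                    rules.append((tuple(lhs), tuple(rhs), sup, conf))
    return rules
```

Privacy-preserving variants typically perturb or sanitize the transactions so that sensitive rules fall below these thresholds while non-sensitive rules remain discoverable.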

Periodicity mining is used for predicting trends in time series data. Discovering the rate at which a time series is periodic has always been an obstacle to fully automated periodicity mining. Existing algorithms assume that the periodicity rate (or simply the period) is user-specified. This assumption is a considerable limitation, especially for time series data where the period is not known a priori. In this paper, we address the problem of detecting the period in a time series database. Two types of periodicities are defined, and a scalable, computationally efficient algorithm is proposed...

10.1109/tkde.2005.114 article EN IEEE Transactions on Knowledge and Data Engineering 2005-05-24
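
Period detection can be sketched with autocorrelation: score each candidate lag and keep the best. This is a generic illustration; the paper's algorithm is more scalable and targets symbol series:

```python
def detect_period(series, max_period=None):
    """Estimate the dominant period of a numeric series by picking
    the shift with the highest autocorrelation. Dividing by the full
    series length (a biased estimator) deliberately penalizes longer
    lags, so multiples of the true period score lower than it."""
    n = len(series)
    max_period = max_period or n // 2
    mean = sum(series) / n
    dev = [x - mean for x in series]
    def autocorr(lag):
        return sum(dev[i] * dev[i + lag] for i in range(n - lag)) / n
    return max(range(1, max_period + 1), key=autocorr)
```

For a series that repeats every p samples, lags p, 2p, 3p, ... all align well; the biased normalization is what makes the shortest strong period win.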

In this paper we present GDR, a Guided Data Repair framework that incorporates user feedback in the cleaning process to enhance and accelerate existing automatic repair techniques while minimizing user involvement. GDR consults the user on the updates that are most likely to be beneficial in improving data quality. GDR also uses machine learning methods to identify and directly apply the correct updates to the database without the actual involvement of the user in these specific updates. To rank potential updates for consultation by the user, we first group repairs and quantify...

10.14778/1952376.1952378 article EN Proceedings of the VLDB Endowment 2011-02-01
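
Ranking repair groups by expected benefit can be sketched as a score combining how many violations a group would resolve with the model's confidence in it (a hypothetical stand-in for GDR's ranking; the field names are illustrative):

```python
def rank_repairs(candidate_groups):
    """Order groups of candidate repairs by expected benefit to data
    quality: benefit = (violations the group would fix) weighted by
    the learned confidence that the repair is correct. Groups the
    user should be consulted on first come out on top."""
    scored = [(g["violations_fixed"] * g["confidence"], g["id"])
              for g in candidate_groups]
    return [gid for score, gid in sorted(scored, reverse=True)]
```

High-confidence groups can be applied automatically, while ambiguous but high-impact groups are the best use of scarce user attention.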

10.1023/a:1013616213333 article EN Mobile Networks and Applications 1997-01-01

Integrating data from multiple sources has been a longstanding challenge in the database community. Techniques such as privacy-preserving data mining promise privacy, but assume that integration has already been accomplished. Data integration methods, in turn, are seriously hampered by the inability to share the data to be integrated. This paper lays out a privacy framework for data integration. Challenges in the context of this framework are discussed, along with existing accomplishments. Many of these challenges present opportunities...

10.1145/1008694.1008698 article EN 2004-06-13

In many business scenarios, record matching is performed across different data sources with the aim of identifying common information shared among these sources. However, such a need is often in contrast with privacy requirements concerning the data stored by the sources. In this paper, we propose a protocol for record matching that preserves privacy both at the data level and at the schema level. Specifically, if two sources need to identify their common data, by running the protocol they can compute the matching of their datasets without sharing their data in the clear, obtaining only the result of the matching. The protocol uses a third party, and maps records into a vector...

10.1145/1247480.1247553 article EN 2007-06-11
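
The flavor of the protocol, matching via a third party that only ever sees transformed records, can be sketched with keyed hashing (HMAC stands in for the paper's vector-space mapping, so this toy version only supports exact matches, not approximate ones):

```python
import hashlib
import hmac

def blind(records, key):
    """Each source blinds its records with a shared secret key before
    sending them to the third party; without the key, the blinded
    values reveal nothing about the underlying records."""
    return {hmac.new(key, r.encode(), hashlib.sha256).hexdigest(): r
            for r in records}

def third_party_match(blinded_a, blinded_b):
    """The third party sees only blinded values and reports which of
    them coincide; it learns nothing about non-matching records."""
    return set(blinded_a) & set(blinded_b)

key = b"shared-secret"            # agreed upon by the two sources only
a = blind({"alice", "bob", "carol"}, key)
b = blind({"bob", "dave"}, key)
matches = third_party_match(a, b)
common = {a[h] for h in matches}  # each source decodes its own side
```

Only the two sources hold the key, so the third party can compute the intersection without ever learning the records themselves.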

In relational database management systems, views supplement basic query constructs to cope with the demand for "higher-level" views of data. Moreover, in traditional query optimization, answering a query using a set of existing materialized views can yield a more efficient execution plan. Due to their effectiveness, views are attractive to data stream management systems. In order to support views over streams, a system should employ a closed (or composable) continuous query language. A closed language is one in which query inputs and outputs are interpreted the same way, hence allowing...

10.1145/1670243.1670244 article EN ACM Transactions on Database Systems 2008-02-15
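
Closure (composability) is easy to see with stream operators that both consume and produce streams; a Python-generator sketch (illustrative only, not the paper's query language):

```python
def stream_filter(pred, stream):
    """A continuous selection: consumes a stream, yields a stream."""
    for item in stream:
        if pred(item):
            yield item

def stream_window_avg(size, stream):
    """A continuous aggregate over a tumbling window: its output is
    itself a stream, so it can feed further continuous queries."""
    buf = []
    for item in stream:
        buf.append(item)
        if len(buf) == size:
            yield sum(buf) / size
            buf = []

# Because every operator maps stream -> stream, views compose freely:
readings = iter([3, 9, 4, 8, 100, 2, 6, 10])
view = stream_window_avg(2, stream_filter(lambda x: x < 50, readings))
result = list(view)
```

If an operator instead produced a finite relation, it could not be nested under another continuous query, which is exactly the closure property the paper requires.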

Various constraint-based methods for data repairing have been proposed over the last decades to identify errors and, when possible, correct them. However, these approaches have several limitations, including scalability and the quality of the values to be used in replacement of the errors. In this paper, we propose a new approach that is based on maximizing the likelihood of the replacement data given the data distribution, which can be modeled using statistical machine learning techniques. This is a novel approach to cleaning dirty...

10.1145/2463676.2463706 preprint EN 2013-06-22
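
The maximum-likelihood repair idea can be sketched as: learn a conditional distribution from the data, then replace values that are improbable in their context with the most likely value (a toy stand-in for the statistical ML models in the paper; `learn_conditional` and `ml_repair` are illustrative names):

```python
from collections import Counter, defaultdict

def learn_conditional(rows, given, target):
    """Estimate P(target | given) by counting co-occurrences in the
    (mostly clean) data -- a crude stand-in for the statistical
    models the paper builds."""
    cond = defaultdict(Counter)
    for r in rows:
        cond[r[given]][r[target]] += 1
    return cond

def ml_repair(row, cond, given, target):
    """Replace the target value with the maximum-likelihood value for
    this row's context, if the observed value was never seen there."""
    dist = cond.get(row[given])
    if dist and row[target] not in dist:
        row = dict(row)  # copy; never mutate the caller's row
        row[target] = dist.most_common(1)[0][0]
    return row
```

The repair is driven by the data's own distribution rather than by hand-written constraints, which is the shift in perspective the abstract describes.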