- Privacy-Preserving Technologies in Data
- Data Quality and Management
- Cryptography and Data Security
- Internet Traffic Analysis and Secure E-voting
- Online Learning and Analytics
- Imbalanced Data Classification Techniques
- Data Mining Algorithms and Applications
- Online and Blended Learning
- Advanced Database Systems and Queries
- Data Management and Algorithms
- Cloud Data Security Solutions
- Data-Driven Disease Surveillance
- Adversarial Robustness in Machine Learning
- Privacy, Security, and Data Protection
- Human Mobility and Location-Based Analysis
- Experimental Learning in Engineering
- Topic Modeling
- Artificial Intelligence in Healthcare and Education
- Data Stream Mining Techniques
- Access Control and Trust
- Intelligent Tutoring Systems and Adaptive Learning
- Semantic Web and Ontologies
- Parallel Computing and Optimization Techniques
- Innovative Teaching and Learning Methods
- Vehicular Ad Hoc Networks (VANETs)
Hellenic Open University
2016-2025
National and Kapodistrian University of Athens
2024
Athens Eye Hospital
2024
National Statistical Institute of Portugal
2022
IBM (United States)
2019
University of Thessaly
2003-2013
Research Academic Computer Technology Institute
1994-2008
IEEE Computer Society
2006-2007
Drexel University
2000-2003
Purdue University West Lafayette
1997-2003
Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task. Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats, or any combination of these factors. In this paper, we present a thorough analysis of the literature on duplicate record detection. We cover similarity metrics that are commonly used to detect similar field entries, and an...
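As an illustration of what a field-level similarity metric looks like, here is a minimal Python sketch of two generic ones: a normalized edit (Levenshtein) similarity for typographical variation and a token-based Jaccard similarity for reordered or missing words. The abstract does not say which metrics the survey covers, so the choice of metrics and the function names (`edit_distance`, `edit_similarity`, `token_jaccard`) are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic program (two rows)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Map edit distance to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def token_jaccard(a: str, b: str) -> float:
    """Jaccard overlap of lower-cased whitespace tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(edit_similarity("Jonathan Smith", "Jonathon Smith"))
    print(token_jaccard("John A Smith", "Smith John"))
```

Character-level metrics tolerate transcription errors inside a field, while token-level metrics tolerate word reordering; a matcher would typically combine several such scores per record pair.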
We provide here an overview of the new and rapidly emerging research area of privacy preserving data mining. We also propose a classification hierarchy that sets the basis for analyzing the work which has been performed in this context. A detailed review of the work accomplished in this area is given, along with the coordinates of each work to the classification hierarchy. A brief evaluation is performed, and some initial conclusions are made.
Large repositories of data contain sensitive information that must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and for government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may encounter when releasing data to outside parties. A key problem, still not sufficiently investigated, is the need to balance the confidentiality of the disclosed data with the legitimate needs...
Data products (macrodata or tabular data, and micro-data or raw data records) are designed to inform public or business policy, and research or public information. Securing these products against unauthorized accesses has been a long-term goal of the database security research community and the government statistical agencies. Solutions to this problem require combining several techniques and mechanisms. Recent advances in data mining and machine learning algorithms have, however, increased the security risks one may incur when releasing data for mining from outside parties. Issues...
Smart cities, leveraging advanced data analytics, predictive models, and digital twin techniques, offer a transformative model for sustainable urban development. Predictive analytics is critical to proactive planning, enabling cities to adapt to evolving challenges. Concurrently, digital twin techniques provide a virtual replica of the urban environment, fostering real-time monitoring, simulation, and analysis of urban systems. This study underscores the significance of these systems to support test scenarios that identify bottlenecks and enhance...
Data mining technology has given us new capabilities to identify correlations in large data sets. This introduces risks when the data is to be made public, but the correlations are private. We introduce a method for selectively removing individual values from a database to prevent the discovery of a set of rules, while preserving the data for other applications. The efficacy and complexity of this method are discussed. We also present an experiment showing an example of this methodology.
Data cleaning is a vital process that ensures the quality of data stored in real-world databases. Data cleaning problems are frequently encountered in many research areas, such as knowledge discovery in databases, data warehousing, system integration and e-services. The process of identifying the record pairs that represent the same entity (duplicate records), commonly known as record linkage, is one of the essential elements of data cleaning. In this paper, we address the record linkage problem by adopting a machine learning approach. Three models are proposed and analyzed empirically....
The current trend in the application space towards systems of loosely coupled and dynamically bound components that enable just-in-time integration jeopardizes the security of information that is shared between the broker, the requester, and the provider at runtime. In particular, new advances in data mining and knowledge discovery, which allow for the extraction of hidden knowledge from an enormous amount of data, impose new threats on the seamless integration of information. We consider the problem of building privacy preserving algorithms for one category of data mining techniques, association rule...
Smart cities, leveraging advanced data analytics, predictive models, and digital twin techniques, offer a transformative model for sustainable urban development. Predictive analytics plays a crucial role in proactive planning, enabling cities to adapt to evolving challenges. Concurrently, digital twin techniques provide a virtual replica of the urban environment, fostering real-time monitoring, simulation, and analysis of urban systems. This research underscores the significance of these systems to support test scenarios that identify...
This nationwide study aims to analyze mortality trends for all individual causes in Greece from 2001 to 2020, with a specific focus on the year influenced by the COVID-19 pandemic. As Greece is the fastest-aging country in Europe, the study's findings can be generalized to other aging societies, guiding the reevaluation of global health policies.
The rapid growth of transactional data soon brought attention to the need for its further exploitation. In this paper, we investigate the problem of securing sensitive knowledge from being exposed in patterns extracted during association rule mining. Instead of hiding the produced rules directly, we decide to hide the sensitive frequent itemsets that may lead to the production of these rules. As a first step, we introduce the notion of distance between two databases and a measure for quantifying it. By trying to minimize the distance between the original database...
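The two quantities this abstract relies on, the support of an itemset and a distance between an original and a modified database, can be sketched in a few lines of Python. The abstract is truncated before the measure is defined, so the distance below (the number of item occurrences that differ between positionally aligned transactions) is an assumption chosen for illustration, not necessarily the paper's measure; the names `support` and `database_distance` are likewise hypothetical.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(db: List[Transaction], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    if not db:
        return 0.0
    return sum(1 for t in db if itemset <= t) / len(db)


def database_distance(original: List[Transaction],
                      sanitized: List[Transaction]) -> int:
    """Assumed measure: count of item occurrences that differ between
    the two databases, comparing transactions position by position."""
    if len(original) != len(sanitized):
        raise ValueError("databases must have the same number of transactions")
    return sum(len(t ^ s) for t, s in zip(original, sanitized))


if __name__ == "__main__":
    original = [frozenset("abc"), frozenset("ab"),
                frozenset("ac"), frozenset("bc")]
    # Hide the itemset {a, b} by removing item b from the second transaction.
    sanitized = [frozenset("abc"), frozenset("a"),
                 frozenset("ac"), frozenset("bc")]
    sensitive = frozenset("ab")
    print(support(original, sensitive))            # 0.5
    print(support(sanitized, sensitive))           # 0.25
    print(database_distance(original, sanitized))  # 1
```

With a minimum support threshold of, say, 0.3, the single removal drops the sensitive itemset below the threshold at a distance of 1 from the original database.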
In this paper, we propose a novel, exact border-based approach that provides an optimal solution for the hiding of sensitive frequent itemsets by (i) minimally extending the original database by a synthetically generated database part - the database extension, (ii) formulating the creation of the database extension as a constraint satisfaction problem, (iii) mapping the constraint satisfaction problem to an equivalent binary integer programming problem, (iv) exploiting underutilized synthetic transactions to proportionally increase the support of non-sensitive itemsets, (v) relaxing the constraint satisfaction problem to provide...
The offering of anonymity in relational databases has attracted a great deal of attention in the database community during the last decade [4]. Among the different solution approaches that have been proposed to tackle this problem, K-anonymity has received increased attention and has been extensively studied in various forms. New forms of data that come into existence, like location data capturing user movement, pave the way for the offering of cutting edge services such as the prevailing Location Based Services (LBSs). Given that these services assume an in-depth knowledge of the mobile...
We present a Λ-fold Redundant Blocking Framework that relies on the Locality-Sensitive Hashing technique for identifying candidate record pairs which have undergone an anonymization transformation. In this context, we demonstrate the usage and evaluate the performance of a variety of families of hash functions used for blocking. We illustrate that the performance attained is highly correlated to the distance-preserving properties of the anonymization format used. The parameters of the blocking scheme are optimally selected so as to achieve the highest possible accuracy in...
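A minimal Python sketch of redundant LSH blocking follows, assuming records have already been encoded as fixed-length bit vectors by some anonymizing transformation. It uses bit sampling, a standard LSH family for Hamming distance: each of `n_tables` independent blocking groups builds its key from `k` sampled bit positions, and two records become a candidate pair if they collide in at least one group. The function names, the parameters, and the choice of hash family are assumptions for this example; the abstract does not specify the framework's actual families or parameter selection.

```python
import random
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Sequence, Set, Tuple


def build_blocking_tables(n_bits: int, n_tables: int, k: int,
                          seed: int = 0) -> List[Tuple[int, ...]]:
    """Draw k distinct bit positions for each of the n_tables groups."""
    rng = random.Random(seed)
    return [tuple(rng.sample(range(n_bits), k)) for _ in range(n_tables)]


def candidate_pairs(records: Dict[str, Sequence[int]],
                    tables: List[Tuple[int, ...]]) -> Set[Tuple[str, str]]:
    """Return pairs of record ids that share a bucket in any table."""
    pairs: Set[Tuple[str, str]] = set()
    for positions in tables:
        buckets = defaultdict(list)
        for rec_id, bits in records.items():
            key = tuple(bits[p] for p in positions)  # composite hash key
            buckets[key].append(rec_id)
        for ids in buckets.values():
            pairs.update(combinations(sorted(ids), 2))
    return pairs


if __name__ == "__main__":
    records = {
        "r1": [1, 0, 1, 1, 0, 0, 1, 0],
        "r2": [1, 0, 1, 1, 0, 0, 1, 1],  # differs from r1 in one bit
        "r3": [0, 1, 0, 0, 1, 1, 0, 1],  # far from r1 in Hamming distance
    }
    tables = build_blocking_tables(n_bits=8, n_tables=4, k=3)
    print(candidate_pairs(records, tables))
```

Larger `k` makes each key more selective (fewer false candidates), while more tables give similar records more chances to collide; this trade-off is what a parameter selection step would tune.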
Artificial Intelligence (AI) has shown the ability to enhance the accuracy and efficiency of physicians. ChatGPT is an AI chatbot that can interact with humans through text, over the internet. It is trained with machine learning algorithms, using large datasets. In this study, we compare the performance of a ChatGPT API 3.5 Turbo model to a general model, in assisting urologists in obtaining accurate, valid medical information. The API was accessed through a Python script that was applied specifically for this study based on the 2023 EAU guidelines in PDF format....