- Advanced Database Systems and Queries
- Semantic Web and Ontologies
- Data Quality and Management
- Data Management and Algorithms
- Cloud Computing and Resource Management
- Service-Oriented Architecture and Web Services
- Data Mining Algorithms and Applications
- Scientific Computing and Data Management
- Big Data and Business Intelligence
- Business Process Modeling and Analysis
- Distributed Systems and Fault Tolerance
- Web Data Mining and Analysis
- Software System Performance and Reliability
- Distributed and Parallel Computing Systems
- Parallel Computing and Optimization Techniques
- Graph Theory and Algorithms
- Data Stream Mining Techniques
- Advanced Data Storage Technologies
- Peer-to-Peer Network Technologies
- IoT and Edge/Fog Computing
- Advanced Text Analysis Techniques
- Image Processing and 3D Reconstruction
- Anomaly Detection Techniques and Applications
- Advanced Malware Detection Techniques
- Advanced Graph Neural Networks
Athena Research and Innovation Center In Information Communication & Knowledge Technologies
2022-2024
Université Libre de Bruxelles
2024
Hewlett-Packard (United States)
2011-2021
Technical University of Crete
2020
National Technical University of Athens
2001-2011
IBM Research - Almaden
2007-2009
Stanford University
2007-2009
Hewlett Packard Enterprise (United States)
2009
IBM (United States)
2007-2008
Palo Alto University
2007-2008
Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. In this paper, we focus on the problem of the definition of ETL activities and provide formal foundations for their conceptual representation. The proposed conceptual model is (a) customized for the tracing of inter-attribute relationships and the respective ETL activities in the early stages of a data warehouse project; (b) enriched with a 'palette' of a set of frequently used ETL activities, like...
Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. Usually, these processes must be completed in a certain time window; thus, it is necessary to optimize their execution time. In this paper, we delve into the logical optimization of ETL processes, modeling it as a state-space search problem. We consider each ETL workflow as a state and fabricate the state space through a set of correct state transitions....
Business Intelligence (BI) refers to technologies, tools, and practices for collecting, integrating, analyzing, and presenting large volumes of information to enable better decision making. Today's BI architecture typically consists of a data warehouse (or one or more data marts), which consolidates data from several operational databases, and serves a variety of front-end querying, reporting, and analytic tools. The back-end of the architecture is a data integration pipeline for populating the data warehouse by extracting data from distributed and usually heterogeneous operational sources;...
This paper describes the convergence of some of the most influential technologies in the last few years, namely data warehousing (DW), on-line analytical processing (OLAP), and the Semantic Web (SW). OLAP is used by enterprises to derive important business-critical knowledge from data inside the company. However, the most interesting OLAP queries can no longer be answered on internal data alone; external data must also be discovered (most often on the web), acquired, integrated, and (analytically) queried, resulting in a new type of OLAP, exploratory OLAP....
As the web is increasingly used not only to find answers to specific information needs but also to carry out various tasks, enhancing the capabilities of current web search engines with effective and efficient techniques for web service retrieval and selection becomes an important issue. Existing service matchmakers typically determine the relevance between a web service advertisement and a service request by computing an overall score that aggregates individual matching scores among the various parameters in their descriptions. Two main drawbacks characterize such...
One of the main tasks in the early stages of a data warehouse project is the identification of the appropriate transformations and the specification of inter-schema mappings from the data sources to the data warehouse. In this article, we propose an ontology-based approach to facilitate the conceptual design of the back stage of a data warehouse. A graph-based representation is used as a conceptual model for the datastores, so that both structured and semi-structured data are supported and handled in a uniform way. The proposed approach is based on the use of Semantic Web technologies to semantically annotate the data sources and the data warehouse,...
Next generation business intelligence involves data flows that span different execution engines, contain complex functionality like data/text analytics and machine learning operations, and need to be optimized against various objectives. Creating correct analytic data flows in such an environment is a challenging task that is both labor-intensive and time-consuming. Optimizing these flows is currently an ad-hoc process where the result is largely dependent on the abilities and experience of the flow designer. Our previous work addressed...
Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization, and insertion into a data warehouse. In this paper, we delve into the logical optimization of ETL processes, modeling it as a state-space search problem. We consider each ETL workflow as a state and fabricate the state space through a set of correct state transitions. Moreover, we provide an exhaustive and two heuristic algorithms toward the minimization of the execution cost of an ETL workflow. The algorithm with...
One of the most important tasks performed in the early stages of a data warehouse project is the analysis of the structure and content of the existing data sources and their intentional mapping to a common data model. Establishing the appropriate mappings between the attributes of the data sources and the attributes of the data warehouse tables is critical in specifying the required transformations in an ETL workflow. The selected data model should be suitable for facilitating the redefinition and revision efforts, typically occurring during the early phases of a data warehouse project, and serve as the means of communication between the involved parties. In this paper, we...
Active data warehousing has emerged as an alternative to conventional warehousing practices in order to meet the high demand of applications for up-to-date information. In a nutshell, an active warehouse is refreshed on-line and thus achieves a higher consistency between the stored information and the latest data updates. The need for on-line warehouse refreshment introduces several challenges in the implementation of data warehouse transformations, with respect to their execution time and their overhead to the warehouse processes. In this paper, we focus on a frequently encountered operation in this context,...
Active data warehousing has emerged as an alternative to conventional warehousing practices in order to meet the high demand of applications for up-to-date information. In a nutshell, an active warehouse is refreshed online and thus achieves a higher consistency between the stored information and the latest data updates. The need for online warehouse refreshment introduces several challenges in the implementation of data warehouse transformations, with respect to their execution time and their overhead to the warehouse processes. In this paper, we focus on a frequently encountered operation in this context, namely,...
In this paper, we deal with the problem of determining the best possible physical implementation of an ETL workflow, given its logical-level description and an appropriate cost model as inputs. We formulate the problem as a state-space problem and provide a suitable solution for this task. We further extend this technique by intentionally introducing sorter activities in the workflow in order to search for alternative physical implementations with lower cost. We experimentally assess our method based on a principled organization of test suites.
Many applications offer a form-based environment for naïve users for accessing databases without being familiar with the database schema or a structured query language. User interactions are translated to structured queries and executed. However, as a user is unlikely to know the underlying semantic connections among the fields presented in a form, it is often useful to provide her with a textual explanation of the query. In this paper, we take a graph-based approach to the query translation problem. We represent various forms of structured queries as directed graphs and we annotate...
As business intelligence becomes increasingly essential for organizations and as it evolves from strategic to operational, the complexity of Extract-Transform-Load (ETL) processes grows. In consequence, ETL engagements have become very time consuming, labor intensive, and costly. At the same time, additional requirements besides functionality and performance need to be considered in the design of ETL processes. In particular, the design quality needs to be determined by an intricate combination of different metrics like reliability,...
Adversarial attacks pose a significant threat to data-driven systems, and researchers have spent considerable resources studying them. Despite its economic relevance, this trend has largely overlooked the issue of credit card fraud detection. To address this gap, we propose a new threat model that demonstrates the limitations of existing attacks and highlights the necessity to investigate new approaches. We then design a new adversarial attack for credit card fraud detection, employing reinforcement learning to bypass classifiers. This attack, called FRAUD-RLA, is...
Extract-Transform-Load (ETL) activities are software modules responsible for populating a data warehouse with operational data, which have undergone a series of transformations on their way to the warehouse. The whole process is very complex and of significant importance for the design and maintenance of the data warehouse. A plethora of commercial ETL tools are already available in the market. However, each one of them follows a different approach for the modeling of ETL activities; i.e., of the building blocks of an ETL workflow. As a result, so far there is no...
Extract-Transform-Load (ETL) processes play an important role in data warehousing. Typically, design work on ETL has focused on performance as the sole metric to make sure that the ETL process finishes within an allocated time window. However, other quality metrics are also important and need to be considered during ETL design. In this paper, we address ETL design for performance plus fault-tolerance and freshness. There are many reasons why an ETL process can fail, and a good design needs to guarantee that it can be recovered. How to make ETL robust to failures is not trivial. There are different strategies that can be used and they...
As we move from a Web of data to a Web of services, enhancing the capabilities of the current Web search engines with effective and efficient techniques for Web services retrieval and selection becomes an important issue. Traditionally, the relevance of a Web service advertisement to a service request is determined by computing an overall score that aggregates individual matching scores among the various parameters in their descriptions. Two drawbacks characterize such approaches. First, there is no single matching criterion that is optimal for determining the similarity between...