Alkis Simitsis

ORCID: 0009-0006-6078-5323
Research Areas
  • Advanced Database Systems and Queries
  • Semantic Web and Ontologies
  • Data Quality and Management
  • Data Management and Algorithms
  • Cloud Computing and Resource Management
  • Service-Oriented Architecture and Web Services
  • Data Mining Algorithms and Applications
  • Scientific Computing and Data Management
  • Big Data and Business Intelligence
  • Business Process Modeling and Analysis
  • Distributed Systems and Fault Tolerance
  • Web Data Mining and Analysis
  • Software System Performance and Reliability
  • Distributed and Parallel Computing Systems
  • Parallel Computing and Optimization Techniques
  • Graph Theory and Algorithms
  • Data Stream Mining Techniques
  • Advanced Data Storage Technologies
  • Peer-to-Peer Network Technologies
  • IoT and Edge/Fog Computing
  • Advanced Text Analysis Techniques
  • Image Processing and 3D Reconstruction
  • Anomaly Detection Techniques and Applications
  • Advanced Malware Detection Techniques
  • Advanced Graph Neural Networks

Athena Research and Innovation Center In Information Communication & Knowledge Technologies
2022-2024

Université Libre de Bruxelles
2024

Hewlett-Packard (United States)
2011-2021

Technical University of Crete
2020

National Technical University of Athens
2001-2011

IBM Research - Almaden
2007-2009

Stanford University
2007-2009

Hewlett Packard Enterprise (United States)
2009

IBM (United States)
2007-2008

Palo Alto University
2007-2008

Extraction-Transformation-Loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. In this paper, we focus on the problem of the definition of ETL activities and provide formal foundations for their conceptual representation. The proposed conceptual model is (a) customized for the tracing of inter-attribute relationships and the respective ETL activities in the early stages of a data warehouse project; (b) enriched with a 'palette' of a set of frequently used ETL activities, like...

10.1145/583890.583893 article EN 2002-11-08

Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization and insertion into a data warehouse. Usually, these processes must be completed in a certain time window; thus, it is necessary to optimize their execution time. In this paper, we delve into the logical optimization of ETL processes, modeling it as a state-space search problem. We consider each ETL workflow as a state and fabricate the state space through a set of correct state transitions....

10.1109/icde.2005.103 article EN 2005-04-19

Business Intelligence (BI) refers to technologies, tools, and practices for collecting, integrating, analyzing, and presenting large volumes of information to enable better decision making. Today's BI architecture typically consists of a data warehouse (or one or more data marts), which consolidates data from several operational databases, and serves a variety of front-end querying, reporting, and analytic tools. The back-end of the architecture is a data integration pipeline for populating the data warehouse by extracting data from distributed and usually heterogeneous operational sources;...

10.1145/1516360.1516362 article EN 2009-03-24

This paper describes the convergence of some of the most influential technologies in the last few years, namely data warehousing (DW), on-line analytical processing (OLAP), and the Semantic Web (SW). OLAP is used by enterprises to derive important business-critical knowledge from data inside the company. However, the most interesting OLAP queries can no longer be answered on internal data alone; external data must also be discovered (most often on the web), acquired, integrated, and (analytically) queried, resulting in a new type of OLAP, exploratory OLAP....

10.1109/tkde.2014.2330822 article EN IEEE Transactions on Knowledge and Data Engineering 2014-06-19

As the web is increasingly used not only to find answers to specific information needs but also to carry out various tasks, enhancing the capabilities of current web search engines with effective and efficient techniques for web service retrieval and selection becomes an important issue. Existing service matchmakers typically determine the relevance between a web service advertisement and a service request by computing an overall score that aggregates individual matching scores among the parameters in their descriptions. Two main drawbacks characterize such...

10.1109/tsc.2010.14 article EN IEEE Transactions on Services Computing 2010-05-10

One of the main tasks in the early stages of a data warehouse project is the identification of the appropriate transformations and the specification of inter-schema mappings from the data sources to the data warehouse. In this article, we propose an ontology-based approach to facilitate the conceptual design of the back stage of a data warehouse. A graph-based representation is used as a conceptual model for the datastores, so that both structured and semi-structured data are supported and handled in a uniform way. The proposed approach is based on the use of Semantic Web technologies to semantically annotate the data sources and the data warehouse,...

10.4018/jswis.2007100101 article EN International Journal on Semantic Web and Information Systems 2007-10-01

Next generation business intelligence involves data flows that span different execution engines, contain complex functionality like data/text analytics and machine learning operations, and need to be optimized against various objectives. Creating correct analytic data flows in such an environment is a challenging task and is both labor-intensive and time-consuming. Optimizing these flows is currently an ad-hoc process where the result is largely dependent on the abilities and experience of the flow designer. Our previous work addressed...

10.1145/2213836.2213963 article EN 2012-05-20

Extraction-transformation-loading (ETL) tools are pieces of software responsible for the extraction of data from several sources, their cleansing, customization, and insertion into a data warehouse. In this paper, we delve into the logical optimization of ETL processes, modeling it as a state-space search problem. We consider each ETL workflow as a state and fabricate the state space through a set of correct state transitions. Moreover, we provide an exhaustive and two heuristic algorithms toward the minimization of the execution cost of an ETL workflow. The heuristic algorithm with...

10.1109/tkde.2005.169 article EN IEEE Transactions on Knowledge and Data Engineering 2005-08-30

One of the most important tasks performed in the early stages of a data warehouse project is the analysis of the structure and content of the existing data sources and their intentional mapping to a common data model. Establishing the appropriate mappings between the attributes of the data sources and the attributes of the data warehouse tables is critical in specifying the required transformations in an ETL workflow. The selected data model should be suitable for facilitating the redefinition and revision efforts, typically occurring during the early phases of a data warehouse project, and serve as the means of communication between the involved parties. In this paper, we...

10.1145/1183512.1183526 article EN 2006-11-10

Active data warehousing has emerged as an alternative to conventional warehousing practices in order to meet the high demand of applications for up-to-date information. In a nutshell, an active warehouse is refreshed on-line and thus achieves a higher consistency between the stored information and the latest data updates. The need for on-line warehouse refreshment introduces several challenges in the implementation of data warehouse transformations, with respect to their execution time and their overhead to the warehouse processes. In this paper, we focus on a frequently encountered operation in this context,...

10.1109/icde.2007.367893 article EN 2007-04-01

Active data warehousing has emerged as an alternative to conventional warehousing practices in order to meet the high demand of applications for up-to-date information. In a nutshell, an active warehouse is refreshed online and thus achieves a higher consistency between the stored information and the latest data updates. The need for online warehouse refreshment introduces several challenges in the implementation of data warehouse transformations, with respect to their execution time and their overhead to the warehouse processes. In this paper, we focus on a frequently encountered operation in this context, namely,...

10.1109/tkde.2008.27 article EN IEEE Transactions on Knowledge and Data Engineering 2008-05-29

In this paper, we deal with the problem of determining the best possible physical implementation of an ETL workflow, given its logical-level description and an appropriate cost model as inputs. We formulate the problem as a state-space problem and provide a suitable solution for this task. We further extend this technique by intentionally introducing sorter activities in the workflow in order to search for alternative physical implementations with lower cost. We experimentally assess our method based on a principled organization of test suites.

10.1145/1317331.1317341 article EN 2007-11-09

Many applications offer a form-based environment for naïve users for accessing databases without being familiar with the database schema or a structured query language. User interactions are translated to structured queries and executed. However, as a user is unlikely to know the underlying semantic connections among the fields presented in a form, it is often useful to provide her with a textual explanation of the query. In this paper, we take a graph-based approach to the query translation problem. We represent various forms of structured queries as directed graphs and we annotate...

10.1109/icde.2010.5447824 article EN 2010 IEEE 26th International Conference on Data Engineering (ICDE) 2010-01-01

As business intelligence becomes increasingly essential for organizations and as it evolves from strategic to operational, the complexity of Extract-Transform-Load (ETL) processes grows. In consequence, ETL engagements have become very time consuming, labor intensive, and costly. At the same time, additional requirements besides functionality and performance need to be considered in the design of ETL processes. In particular, the design quality needs to be determined by an intricate combination of different metrics like reliability,...

10.1145/1559845.1559954 article EN 2009-06-29

Adversarial attacks pose a significant threat to data-driven systems, and researchers have spent considerable resources studying them. Despite its economic relevance, this trend largely overlooked the issue of credit card fraud detection. To address this gap, we propose a new threat model that demonstrates the limitations of existing attacks and highlights the necessity to investigate new approaches. We then design a new adversarial attack for credit card fraud detection, employing reinforcement learning to bypass classifiers. This attack, called FRAUD-RLA, is...

10.48550/arxiv.2502.02290 preprint EN arXiv (Cornell University) 2025-02-04

Extract-Transform-Load (ETL) activities are software modules responsible for populating a data warehouse with operational data, which have undergone a series of transformations on their way to the warehouse. The whole process is very complex and of significant importance for the design and maintenance of the data warehouse. A plethora of commercial ETL tools are already available in the market. However, each one of them follows a different approach for the modeling of ETL activities; i.e., of the building blocks of an ETL workflow. As a result, so far there is no...

10.1145/1651291.1651297 article EN 2009-11-06

Extract-Transform-Load (ETL) processes play an important role in data warehousing. Typically, design work on ETL has focused on performance as the sole metric to make sure that the ETL process finishes within an allocated time window. However, other quality metrics are also important and need to be considered during ETL design. In this paper, we address ETL design for performance plus fault-tolerance and freshness. There are many reasons why an ETL process can fail and a good design needs to guarantee that it can be recovered. How to make ETL robust to failures is not trivial. There are different strategies that can be used and they...

10.1109/icde.2010.5447816 article EN 2010 IEEE 26th International Conference on Data Engineering (ICDE) 2010-01-01

As we move from a Web of data to a Web of services, enhancing the capabilities of the current Web search engines with effective and efficient techniques for Web services retrieval and selection becomes an important issue. Traditionally, the relevance of a Web service advertisement to a service request is determined by computing an overall score that aggregates individual matching scores among the various parameters in their descriptions. Two drawbacks characterize such approaches. First, there is no single matching criterion that is optimal for determining the similarity between...

10.1145/1516360.1516463 article EN 2009-03-24