Jana Giceva

ORCID: 0000-0002-1926-3551
Research Areas
  • Advanced Data Storage Technologies
  • Cloud Computing and Resource Management
  • Advanced Database Systems and Queries
  • Distributed systems and fault tolerance
  • Parallel Computing and Optimization Techniques
  • Distributed and Parallel Computing Systems
  • Caching and Content Delivery
  • Scientific Computing and Data Management
  • Graph Theory and Algorithms
  • Data Management and Algorithms
  • Advanced Graph Neural Networks
  • Embedded Systems Design Techniques
  • Big Data and Business Intelligence
  • Algorithms and Data Compression
  • Logic, programming, and type systems
  • Software-Defined Networks and 5G
  • Statistics Education and Methodologies
  • Mathematics, Computing, and Information Processing
  • Internet Traffic Analysis and Secure E-voting
  • Interconnection Networks and Systems
  • Open Education and E-Learning
  • Peer-to-Peer Network Technologies
  • Web Data Mining and Analysis
  • Advanced Memory and Neural Computing
  • Data Stream Mining Techniques

Technical University of Munich
2019-2024

Imperial College London
2018-2019

ETH Zurich
2011-2017

University of Cambridge
2015

Constructor University
2009

In this paper we present BatchDB, an in-memory database engine designed for hybrid OLTP and OLAP workloads. BatchDB achieves good performance, provides a high level of data freshness, and minimizes load interaction between the transactional and analytical engines, thus enabling real-time analysis over fresh data under tight SLAs for both...

10.1145/3035918.3035959 article EN 2017-05-09

Multicore computers pose a substantial challenge to infrastructure software such as operating systems or databases. Such software typically evolves slower than the underlying hardware, and with multicore it faces structural limitations that can be solved only with radical architectural changes. In this paper we argue that, as has been suggested for operating systems, databases could treat multicore architectures as a distributed system rather than trying to hide the parallel nature of the hardware. We first analyze database engines when running on...

10.1145/1966445.1966448 article EN 2011-04-10

Implementing parallel operators in multi-core machines often involves a data partitioning step that divides the data into cache-size blocks and arranges them so as to allow concurrent threads to process them in parallel. Data partitioning is expensive, in some cases up to 90% of the cost of, e.g., a hash join. In this paper we explore the use of an FPGA to accelerate data partitioning. We do so in the context of new hybrid architectures where the FPGA is located as a co-processor residing on a socket and with coherent access to the same memory as the CPU residing on the other socket. Such an architecture reduces...

10.1145/3035918.3035946 article EN 2017-05-09

An efficient implementation of a hash join has been a highly researched problem for decades. Recently, the radix join has been shown to have superior performance over the alternatives (e.g., the non-partitioned hash join), albeit on synthetic microbenchmarks. Therefore, it is unclear whether one can simply replace the hash join in an RDBMS or use the radix join as a performance booster for selected queries. If the latter, it is still unknown when one should rely on the radix join to improve performance.

10.1145/3448016.3452831 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09

Efficient resource scheduling of multithreaded software on multicore hardware is difficult given the many parameters involved and the heterogeneity of existing systems. In this paper we explore the efficient deployment of query plans over a multicore machine. We focus on shared query systems, and implement the proposed ideas using SharedDB. The goal of the paper is to explore how to deliver maximum performance and predictability, while minimizing resource utilization when deploying query plans on multicore machines. We propose to use resource activity vectors to characterize the behavior of individual database...

10.14778/2735508.2735513 article EN Proceedings of the VLDB Endowment 2014-11-01

Since its invention, data-centric code generation has been adopted for query compilation by various database systems in academia and industry. These systems are fast but maximize performance at the expense of developer friendliness, flexibility, and extensibility. Recent advances in the field of compiler construction identified similar issues for domain-specific compilers and introduced a solution with MLIR, a generic infrastructure for domain-specific dialects. We propose a layered query compilation stack based on MLIR with open intermediate representations that can...

10.14778/3551793.3551801 article EN Proceedings of the VLDB Endowment 2022-07-01

Despite the wide adoption of graph processing across many different application domains, there is no underlying data structure that can serve a variety of graph workloads (analytics, traversals, and pattern matching) on dynamic graphs with transactional updates. In this paper, we present Sortledton, a universal graph data structure that addresses this open problem by carefully optimizing for the most relevant data access patterns used by graph computation kernels. It can support millions of transactional updates per second, while providing competitive performance...

10.14778/3514061.3514065 article EN Proceedings of the VLDB Endowment 2022-02-01

Persistent memory (PMem) promised DRAM-like performance, byte addressability, and the persistency guarantees of conventional block storage. With the release of Intel Optane DCPMM, those expectations were dampened. While its write latency competes with DRAM, its read latency, endurance, and especially bandwidth fall behind by up to an order of magnitude. Established PMem index structures mostly focus on lookups and cannot leverage PMem's low write latency. For inserts, DRAM-optimized index structures are still an order of magnitude faster than...

10.14778/3551793.3551839 article EN Proceedings of the VLDB Endowment 2022-07-01

Current hardware development greatly influences the design decisions of modern database systems. For many performance-focused database systems, query compilation emerged as an integral part, and different approaches for code generation evolved, making use of standard compilers, general-purpose compiler libraries, or domain-specific code generators. However, development primarily focused on the dominating x86-64 server architecture, but neglected current hardware developments towards other CPU architectures like ARM and RISC...

10.14778/3583140.3583142 article EN Proceedings of the VLDB Endowment 2023-02-01

Multi-core scalability is one of the most important features for database systems running on today's hardware. Not surprisingly, the implementation of locks is paramount to achieving efficient and scalable synchronization. In this work, we identify the key database-specific requirements for lock implementations and evaluate them using both micro-benchmarks and full-fledged database workloads. The results indicate that optimistic locking has superior performance in most workloads due to its minimal overhead and latency. By complementing...

10.1145/3399666.3399908 article EN 2020-06-04

For decades, database engines have found the generic interfaces offered by operating systems at odds with the need for efficient utilization of hardware resources. As a result, most engines circumvent the OS and manage hardware directly. With the growing complexity and heterogeneity of modern hardware, database engines are now facing a steep increase in the complexity they must absorb to achieve good performance. Taking advantage of recent proposals in operating system design, such as multi-kernels, in this paper we explore the development of a lightweight OS kernel tailored for data...

10.1145/2933349.2933351 article EN 2016-06-01

The rising hardware heterogeneity in modern systems emphasizes new dimensions of optimizing task execution for data processing frameworks. Specialized hardware is often expected to be the exclusive executor for some particular workload because it was designed for it or is simply the fastest option. In heterogeneous database systems, almost always, entire operation offloading is considered. However, little attention is given to execution with horizontal cross-device pipeline parallelization. We argue that such an approach can be applied...

10.1145/3662010.3663441 article EN 2024-05-30

A wealth of technology has evolved around relational databases over decades that has been successfully tried and tested in many settings and use cases. Yet, the majority of it remains overlooked in the pursuit of performance (e.g., NoSQL) or new functionality (e.g., graph data or machine learning). In this paper, we argue that a wide range of techniques readily available in databases are crucial to tackling the challenges the IT industry faces in terms of hardware trends management, growing workloads, and the overall complexity of a rapidly changing application...

10.14778/3476249.3476296 article EN Proceedings of the VLDB Endowment 2021-07-01

Dataflow graphs are a popular abstraction for describing computation, used in many systems for high-level optimization. For execution, dataflow graphs are lowered and optimized through layers of program representations down to machine instructions. Unfortunately, performance profiling of such systems is cumbersome, as today's profilers present results merely at instruction and function granularity. This obfuscates the connection between profiles and high-level constructs, such as operators and pipelines, making interpretation of profiles an exercise in puzzling...

10.1145/3447786.3456254 article EN 2021-04-21

Despite the wide adoption of graph processing across many different application domains, there is no underlying data structure that can serve a variety of graph workloads (analytics, traversals, and pattern matching) on dynamic graphs with single edge updates.

10.1145/3604437.3604442 article EN ACM SIGMOD Record 2023-06-07

With full resource disaggregation on the horizon, it is unclear what the most suitable programming model is that enables dataflow developers to fully harvest the potential that recent hardware developments offer. In our vision, we propose to raise the abstraction level to allow developers to primarily reason about their dataflow and the requirements that need to be met by the underlying system in a declarative fashion. Underneath, the system works with typed memory regions and uses the notion of ownership, which allows for more flexible memory management across the different compute devices...

10.1145/3593856.3595889 article EN 2023-06-22

Data processing systems face the challenge of supporting increasingly diverse workloads efficiently. At the same time, they are already bloated with internal complexity, and it is not clear how new hardware can be supported sustainably. In this paper, we aim to resolve these issues by proposing a unified abstraction layer based on declarative sub-operators in addition to relational operators. By exposing sub-operators to users, they can express their non-relational workloads declaratively with sub-operators. Furthermore, the proposed sub-operators decouple...

10.14778/3611479.3611539 article EN Proceedings of the VLDB Endowment 2023-07-01

With today's data deluge, approximate filters are particularly attractive to avoid expensive operations like remote data/disk accesses. Among the many filter variants available, it is non-trivial to find the most suitable one and its optimal configuration for a specific use-case. We provide open-source implementations for the most relevant filters (Bloom, Cuckoo, Morton, and Xor filters) and compare them in four key dimensions: false-positive rate, space consumption, build, and lookup throughput. We improve upon existing...

10.14778/3476249.3476286 article EN Proceedings of the VLDB Endowment 2021-07-01

Efficiently finding subgraph embeddings in large graphs is crucial for many application areas like biology and social network analysis. Set intersections are the predominant and most challenging aspect of current join-based subgraph query processing systems for CPUs. Previous work has shown the viability of utilizing FPGAs for acceleration of graph and join processing. In this work, we propose GraphMatch, the first general-purpose stand-alone subgraph query processing accelerator based on worst-case optimal joins (WCOJ) that is fully designed for modern, field...

10.48550/arxiv.2402.17559 preprint EN arXiv (Cornell University) 2024-02-27