NFDI4DS | UHH-SEMS - Publication Details

Jana Giceva

ORCID: 0000-0002-1926-3551

About

Contact & Profiles

A5067560519

Research Areas

Advanced Data Storage Technologies
Cloud Computing and Resource Management
Advanced Database Systems and Queries
Distributed systems and fault tolerance
Parallel Computing and Optimization Techniques
Distributed and Parallel Computing Systems
Caching and Content Delivery
Scientific Computing and Data Management
Graph Theory and Algorithms
Data Management and Algorithms
Advanced Graph Neural Networks
Embedded Systems Design Techniques
Big Data and Business Intelligence
Algorithms and Data Compression
Logic, programming, and type systems
Software-Defined Networks and 5G
Statistics Education and Methodologies
Mathematics, Computing, and Information Processing
Internet Traffic Analysis and Secure E-voting
Interconnection Networks and Systems
Open Education and E-Learning
Peer-to-Peer Network Technologies
Web Data Mining and Analysis
Advanced Memory and Neural Computing
Data Stream Mining Techniques

Technical University of Munich
2019-2024

Imperial College London
2018-2019

ETH Zurich
2011-2017

University of Cambridge
2015

Constructor University
2009

BatchDB

OPENALEX - Publications

Darko Makreshanski Jana Giceva Claude Barthels Gustavo Alonso

In this paper we present BatchDB, an in-memory database engine designed for hybrid OLTP and OLAP workloads. BatchDB achieves good performance, provides a high level of data freshness, and minimizes load interaction between the transactional and analytical engines, thus enabling real-time analysis over fresh data under tight SLAs for both...

10.1145/3035918.3035959 article EN 2017-05-09

Database engines on multicores, why parallelize when you can distribute?

Tudor-Ioan Salomie Ionut Emanuel Subasu Jana Giceva Gustavo Alonso

Multicore computers pose a substantial challenge to infrastructure software such as operating systems or databases. Such software typically evolves slower than the underlying hardware, and with multicore it faces structural limitations that can be solved only with radical architectural changes. In this paper we argue that, as has been suggested for operating systems, databases could treat multicore architectures as a distributed system rather than trying to hide the parallel nature of the hardware. We first analyze database engines when running on...

10.1145/1966445.1966448 article EN 2011-04-10

FPGA-based Data Partitioning

Kaan Kara Jana Giceva Gustavo Alonso

Implementing parallel operators in multi-core machines often involves a data partitioning step that divides the data into cache-size blocks and arranges them so as to allow concurrent threads to process them in parallel. Data partitioning is expensive, in some cases up to 90% of the cost of, e.g., a hash join. In this paper we explore the use of an FPGA to accelerate data partitioning. We do so in the context of new hybrid architectures where the FPGA is located as a co-processor residing on a socket and with coherent access to the same memory as the CPU residing on the other socket. Such an architecture reduces...

10.1145/3035918.3035946 article EN 2017-05-09
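
The data partitioning step mentioned in the abstract above can be illustrated with a small software sketch. This is not the paper's FPGA design; it is a plain radix partitioning over integer keys (histogram, prefix sum, scatter), and the function name `radix_partition` and its parameters are hypothetical, chosen only for illustration.

```python
from typing import List


def radix_partition(keys: List[int], bits: int) -> List[List[int]]:
    """Split keys into 2**bits partitions using their low-order bits.

    Illustrative software sketch only; names and layout are assumptions.
    """
    fanout = 1 << bits
    mask = fanout - 1

    # Pass 1: count how many keys fall into each partition.
    counts = [0] * fanout
    for key in keys:
        counts[key & mask] += 1

    # Prefix sum turns the counts into start offsets in the output buffer.
    offsets = [0] * fanout
    for p in range(1, fanout):
        offsets[p] = offsets[p - 1] + counts[p - 1]

    # Pass 2: scatter every key to the next free slot of its partition.
    out = [0] * len(keys)
    cursor = offsets[:]
    for key in keys:
        p = key & mask
        out[cursor[p]] = key
        cursor[p] += 1

    # Return each partition as its own contiguous slice.
    return [out[offsets[p]:offsets[p] + counts[p]] for p in range(fanout)]
```

With a suitable number of bits, each partition becomes small enough to fit in a cache, so later operators such as a hash join can work on one partition at a time.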

To Partition, or Not to Partition, That is the Join Question in a Real System

Maximilian Bandle Jana Giceva Thomas Neumann

An efficient implementation of a hash join has been a highly researched problem for decades. Recently, the radix join has been shown to have superior performance over the alternatives (e.g., the non-partitioned hash join), albeit on synthetic microbenchmarks. Therefore, it is unclear whether one can simply replace the hash join in an RDBMS or use the radix join as a performance booster for selected queries. If the latter, it is still unknown when one should rely on the radix join to improve performance.

10.1145/3448016.3452831 article EN Proceedings of the 2021 International Conference on Management of Data 2021-06-09

Deployment of query plans on multicores

Jana Giceva Gustavo Alonso Timothy Roscoe Tim Harris

Efficient resource scheduling of multithreaded software on multicore hardware is difficult given the many parameters involved and the hardware heterogeneity of existing systems. In this paper we explore the efficient deployment of query plans over a multicore machine. We focus on shared query systems, and implement the proposed ideas using SharedDB. The goal is to explore how to deliver maximum performance and predictability, while minimizing resource utilization when deploying query plans on multicore machines. We propose the use of activity vectors to characterize the behavior of individual database...

10.14778/2735508.2735513 article EN Proceedings of the VLDB Endowment 2014-11-01

Designing an open framework for query optimization and compilation

Michael Jungmair André Kohn Jana Giceva

Since its invention, data-centric code generation has been adopted for query compilation by various database systems in academia and industry. These systems are fast but maximize performance at the expense of developer friendliness, flexibility, and extensibility. Recent advances in the field of compiler construction identified similar issues for domain-specific compilers and introduced a solution with MLIR, a generic infrastructure for domain-specific dialects. We propose a layered query compilation stack based on MLIR with open intermediate representations that can...

10.14778/3551793.3551801 article EN Proceedings of the VLDB Endowment 2022-07-01

Sortledton

Per Fuchs Domagoj Margan Jana Giceva

Despite the wide adoption of graph processing across many different application domains, there is no underlying data structure that can serve a variety of graph workloads (analytics, traversals, and pattern matching) on dynamic graphs with transactional updates. In this paper, we present Sortledton, a universal graph data structure that addresses this open problem by carefully optimizing for the most relevant data access patterns used by graph computation kernels. It can support millions of transactional updates per second, while providing competitive performance...

10.14778/3514061.3514065 article EN Proceedings of the VLDB Endowment 2022-02-01

Plush

Lukas Vogel Alexander van Renen Satoshi Imamura Jana Giceva Thomas Neumann and 1 more

Persistent memory (PMem) promised DRAM-like performance, byte addressability, and the persistency guarantees of conventional block storage. With the release of Intel Optane DCPMM, those expectations were dampened. While its write latency competes with DRAM, its read latency, endurance, and especially bandwidth fall behind by up to an order of magnitude. Established PMem index structures mostly focus on lookups and cannot leverage PMem's low write latency. For inserts, DRAM-optimized index structures are still an order of magnitude faster than...

10.14778/3551793.3551839 article EN Proceedings of the VLDB Endowment 2022-07-01

Bringing Compiling Databases to RISC Architectures

Ferdinand Gruber Maximilian Bandle Alexis Engelke Thomas Neumann Jana Giceva

Current hardware development greatly influences the design decisions of modern database systems. For many performance-focused database systems, query compilation emerged as an integral part, and different approaches for code generation evolved, making use of standard compilers, general-purpose compiler libraries, or domain-specific code generators. However, development primarily focused on the dominating x86-64 server architecture, but neglected current hardware developments towards other CPU architectures like ARM and RISC...

10.14778/3583140.3583142 article EN Proceedings of the VLDB Endowment 2023-02-01

Scalable and robust latches for database systems

Jan P. Böttcher Viktor Leis Jana Giceva Thomas Neumann Alfons Kemper

Multi-core scalability is one of the most important features for database systems running on today's hardware. Not surprisingly, the implementation of locks is paramount to achieving efficient and scalable synchronization. In this work, we identify the key database-specific requirements for lock implementations and evaluate them using both micro-benchmarks and full-fledged database workloads. The results indicate that optimistic locking has superior performance in most workloads due to its minimal overhead and latency. By complementing...

10.1145/3399666.3399908 article EN 2020-06-04

Customized OS support for data-processing

Jana Giceva Gerd Zellweger Gustavo Alonso Timothy Roscoe

For decades, database engines have found the generic interfaces offered by operating systems at odds with the need for efficient utilization of hardware resources. As a result, most engines circumvent the OS and manage hardware directly. With the growing complexity and heterogeneity of modern hardware, database engines are now facing a steep increase in the complexity they must absorb to achieve good performance. Taking advantage of recent proposals in operating system design, such as multi-kernels, in this paper we explore the development of a lightweight OS kernel tailored for data...

10.1145/2933349.2933351 article EN 2016-06-01

Heterogeneous Intra-Pipeline Device-Parallel Aggregations

Artem Kroviakov Petr Kurapov Christoph Anneser Jana Giceva

The rising hardware heterogeneity in modern systems emphasizes new dimensions of optimizing task execution for data processing frameworks. Specialized hardware is often expected to be the exclusive executor of some particular workload because it was designed for it or is simply the fastest option. In heterogeneous database systems, almost always, entire operation offloading is considered. However, little attention is given to execution with horizontal cross-device pipeline parallelization. We argue that such an approach can be applied...

10.1145/3662010.3663441 article EN 2024-05-30

Database technology for the masses

Maximilian Bandle Jana Giceva

A wealth of technology has evolved around relational databases over decades that has been successfully tried and tested in many settings and use cases. Yet, the majority of it remains overlooked in the pursuit of performance (e.g., NoSQL) or new functionality (e.g., graph data or machine learning). In this paper, we argue that a wide range of techniques readily available in databases are crucial to tackling the challenges the IT industry faces in terms of hardware trends management, growing workloads, and the overall complexity of a rapidly changing application...

10.14778/3476249.3476296 article EN Proceedings of the VLDB Endowment 2021-07-01

Profiling dataflow systems on multiple abstraction levels

Alexander Beischl Timo Kersten Maximilian Bandle Jana Giceva Thomas Neumann

Dataflow graphs are a popular abstraction for describing computation, used in many systems for high-level optimization. For execution, dataflow graphs are lowered and optimized through layers of program representations down to machine instructions. Unfortunately, performance profiling of such systems is cumbersome, as today's profilers present results merely at instruction and function granularity. This obfuscates the connection between profiles and high-level constructs, such as operators and pipelines, making interpretation of profiles an exercise in puzzling...

10.1145/3447786.3456254 article EN 2021-04-21

Sortledton: a Universal Graph Data Structure

Per Fuchs Domagoj Margan Jana Giceva

10.1145/3604437.3604442 article EN ACM SIGMOD Record 2023-06-07

Programming Fully Disaggregated Systems

Christoph Anneser Lukas Vogel Ferdinand Gruber Maximilian Bandle Jana Giceva

With full resource disaggregation on the horizon, it is unclear what the most suitable programming model is that enables dataflow developers to fully harvest the potential that recent hardware developments offer. In our vision, we propose to raise the abstraction level to allow developers to primarily reason about their dataflow and the requirements that need to be met by the underlying system in a declarative fashion. Underneath, the system works with typed memory regions and uses the notion of ownership that allows for more flexible memory management across the different compute devices...

10.1145/3593856.3595889 article EN 2023-06-22

Declarative Sub-Operators for Universal Data Processing

Michael Jungmair Jana Giceva

Data processing systems face the challenge of supporting increasingly diverse workloads efficiently. At the same time, they are already bloated with internal complexity, and it is not clear how new hardware can be supported sustainably. In this paper, we aim to resolve these issues by proposing a unified abstraction layer based on declarative sub-operators in addition to relational operators. By exposing this layer to users, they can express their non-relational workloads declaratively with sub-operators. Furthermore, the proposed sub-operators decouple...

10.14778/3611479.3611539 article EN Proceedings of the VLDB Endowment 2023-07-01

A four-dimensional analysis of partitioned approximate filters

Tobias Schmidt Maximilian Bandle Jana Giceva

With today's data deluge, approximate filters are particularly attractive to avoid expensive operations like remote data/disk accesses. Among the many filter variants available, it is non-trivial to find the most suitable one and its optimal configuration for a specific use-case. We provide open-source implementations for the most relevant filters (Bloom, Cuckoo, Morton, and Xor filters) and compare them in four key dimensions: the false-positive rate, space consumption, build, and lookup throughput. We improve upon existing...

10.14778/3476249.3476286 article EN Proceedings of the VLDB Endowment 2021-07-01
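
For readers unfamiliar with approximate filters, the following is a minimal Bloom filter sketch. It is not taken from the paper's open-source implementations; the class name `BloomFilter`, the SHA-256 based double hashing, and the parameters `num_bits` and `num_hashes` are assumptions made for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Minimal Bloom filter sketch (hypothetical names, illustration only)."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive all bit positions from two base hashes (double hashing).
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "little")
        h2 = int.from_bytes(digest[8:16], "little") | 1  # keep the stride non-zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )
```

A lookup that returns False lets the caller skip the expensive remote or disk access entirely. A True result may be a false positive, whose likelihood depends on the number of bits and hash functions relative to the number of inserted items.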

GraphMatch: Subgraph Query Processing on FPGAs

Jonas Dann Tobias Götz Daniel Ritter Jana Giceva Holger Fröning

Efficiently finding subgraph embeddings in large graphs is crucial for many application areas like biology and social network analysis. Set intersections are the predominant and most challenging aspect of current join-based subgraph query processing systems for CPUs. Previous work has shown the viability of utilizing FPGAs for acceleration of graph and join processing. In this work, we propose GraphMatch, the first general-purpose stand-alone subgraph query processing accelerator based on worst-case optimal joins (WCOJ) that is fully designed for modern, field...

10.48550/arxiv.2402.17559 preprint EN arXiv (Cornell University) 2024-02-27
