- Advanced Data Storage Technologies
- Advanced Database Systems and Queries
- Distributed Systems and Fault Tolerance
- Cloud Computing and Resource Management
- Parallel Computing and Optimization Techniques
- Data Management and Algorithms
- Caching and Content Delivery
- Algorithms and Data Compression
- Data Quality and Management
- Distributed and Parallel Computing Systems
- Graph Theory and Algorithms
- Network Packet Processing and Optimization
- Semantic Web and Ontologies
- Interconnection Networks and Systems
- Web Data Mining and Analysis
- Internet Traffic Analysis and Secure E-voting
- IoT and Edge/Fog Computing
- Bayesian Modeling and Causal Inference
- Scientific Computing and Data Management
- Constraint Satisfaction and Optimization
- Data Mining Algorithms and Applications
- Advanced Image and Video Retrieval Techniques
- Video Coding and Compression Technologies
- Time Series Analysis and Forecasting
- Software System Performance and Reliability
Technical University of Munich
2014-2025
Friedrich-Alexander-Universität Erlangen-Nürnberg
2021-2022
Friedrich Schiller University Jena
2019-2021
Tableau Software (United States)
2018
Finding a good join order is crucial for query performance. In this paper, we introduce the Join Order Benchmark (JOB) and experimentally revisit the main components in the classic query optimizer architecture using a complex, real-world data set and realistic multi-join queries. We investigate the quality of industrial-strength cardinality estimators and find that all estimators routinely produce large errors. We further show that while cardinality estimates are essential for finding a good join order, query performance is unsatisfactory if the engine relies too heavily on...
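To illustrate where the large estimation errors come from, here is a small, hypothetical sketch (the data and predicates are invented, not from the benchmark): the common independence assumption multiplies per-column selectivities, which underestimates the result for correlated predicates, and the q-error quantifies how far off the estimate is.

```python
# Toy illustration of cardinality estimation under the independence
# assumption, and the resulting q-error (max(est, true) / min(est, true)).
rows = [("drama", 1990), ("drama", 1991), ("drama", 1992),
        ("comedy", 2005), ("comedy", 2006), ("drama", 1993),
        ("drama", 1994), ("comedy", 2007), ("drama", 1995), ("drama", 1996)]

n = len(rows)
sel_genre = sum(1 for g, _ in rows if g == "drama") / n   # 0.7
sel_year = sum(1 for _, y in rows if y < 2000) / n        # 0.7

# Independence assumption: multiply the individual selectivities.
estimate = sel_genre * sel_year * n                       # 4.9 rows

# True cardinality: here the two predicates are perfectly correlated.
true_card = sum(1 for g, y in rows if g == "drama" and y < 2000)  # 7 rows

q_error = max(estimate, true_card) / min(estimate, true_card)
print(round(estimate, 2), true_card, round(q_error, 2))
```

On real, skewed data and across several joins, such errors compound multiplicatively, which is why join orders picked from bad estimates can be far from optimal.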
Main memory capacities have grown up to a point where most databases fit into RAM. For main-memory database systems, index structure performance is a critical bottleneck. Traditional in-memory data structures like balanced binary search trees are not efficient on modern hardware, because they do not optimally utilize on-CPU caches. Hash tables, also often used for main-memory indexes, are fast but only support point queries. To overcome these shortcomings, we present ART, an adaptive radix tree (trie) for efficient indexing in main...
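The basic radix-tree idea behind ART can be sketched minimally as follows. This is a deliberately naive version: keys are decomposed into bytes and each byte selects a child. ART's actual contributions, adaptive node layouts (Node4/16/48/256) and path compression, are not shown here.

```python
# Simplified, non-adaptive radix tree (trie) over byte-string keys.
class RadixTree:
    def __init__(self):
        self.root = {}

    def insert(self, key: bytes, value):
        node = self.root
        for b in key:                      # one tree level per key byte
            node = node.setdefault(b, {})
        node["value"] = value              # sentinel slot for the payload

    def lookup(self, key: bytes):
        node = self.root
        for b in key:
            if b not in node:
                return None                # key byte has no child: miss
            node = node[b]
        return node.get("value")

t = RadixTree()
t.insert(b"ant", 1)
t.insert(b"art", 2)
print(t.lookup(b"art"), t.lookup(b"arc"))  # -> 2 None
```

Because the tree height depends only on key length, not on the number of keys, lookups touch a bounded number of nodes; ART's adaptive node sizes then keep each of those nodes cache-friendly.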
With modern computer architecture evolving, two problems conspire against the state-of-the-art approaches in parallel query execution: (i) to take advantage of many-cores, all query work must be distributed evenly among (soon) hundreds of threads in order to achieve good speedup, yet (ii) dividing the work evenly is difficult even with accurate data statistics due to the complexity of modern out-of-order cores. As a result, the existing approaches for plan-driven parallelism run into load balancing and context-switching bottlenecks, and therefore no longer...
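The morsel-driven alternative described in this line of work can be sketched as follows (a hypothetical toy, not the paper's engine): the input is split into small fixed-size "morsels", and worker threads pull morsels from a shared queue, so load balances dynamically instead of relying on a static partitioning.

```python
import queue
import threading

def parallel_sum(data, workers=4, morsel_size=1000):
    """Sum `data` by handing out small morsels to worker threads."""
    work = queue.Queue()
    for i in range(0, len(data), morsel_size):
        work.put(data[i:i + morsel_size])     # enqueue one morsel

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                morsel = work.get_nowait()    # pull next available morsel
            except queue.Empty:
                return                        # no work left: thread exits
            partial = sum(morsel)
            with lock:
                results.append(partial)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

print(parallel_sum(list(range(10_000))))  # -> 49995000
```

A slow thread simply claims fewer morsels, so stragglers do not stall the whole query the way a fixed per-thread partition would.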
We describe a new deep learning approach to cardinality estimation. MSCN is a multi-set convolutional network, tailored to representing relational query plans, that employs set semantics to capture query features and true cardinalities. MSCN builds on sampling-based estimation, addressing its weaknesses when no sampled tuples qualify a predicate, and in capturing join-crossing correlations. Our evaluation of MSCN using a real-world dataset shows that it significantly enhances the quality of cardinality estimation, which is the core problem of query optimization.
We present the Succinct Range Filter (SuRF), a fast and compact data structure for approximate membership tests. Unlike traditional Bloom filters, SuRF supports both single-key lookups and common range queries: open-range queries, closed-range queries, and range counts. SuRF is based on a new data structure called the Fast Succinct Trie (FST) that matches the point query performance of state-of-the-art order-preserving indexes, while consuming only 10 bits per trie node. The false positive rates in SuRF for both point and range queries are tunable to satisfy different application...
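The core range-filter idea can be sketched with a much simpler structure than FST (this is a hypothetical illustration, not SuRF's encoding): store truncated key prefixes in sorted order; a probe answers "maybe present" whenever some stored prefix could match, so truncation buys space at the cost of tunable false positives.

```python
import bisect

class PrefixRangeFilter:
    """Approximate point/range filter over fixed-length key prefixes.
    Truncating keys saves space but introduces false positives, loosely
    analogous to SuRF's tunable suffix truncation."""

    def __init__(self, keys, prefix_len=3):
        self.plen = prefix_len
        self.prefixes = sorted({k[:prefix_len] for k in keys})

    def may_contain(self, key):
        i = bisect.bisect_left(self.prefixes, key[:self.plen])
        return i < len(self.prefixes) and self.prefixes[i] == key[:self.plen]

    def may_contain_range(self, lo, hi):
        # "Maybe" if any stored prefix falls between the two bounds.
        i = bisect.bisect_left(self.prefixes, lo[:self.plen])
        return i < len(self.prefixes) and self.prefixes[i] <= hi[:self.plen]

f = PrefixRangeFilter(["apple", "banana", "cherry"])
print(f.may_contain("application"))       # -> True (false positive: "app")
print(f.may_contain_range("bat", "bba"))  # -> False (true negative)
```

Longer prefixes lower the false positive rate but cost more space per key, which is exactly the tuning knob the abstract refers to.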
Non-volatile memory (NVM) is a new storage technology that combines the performance and byte addressability of DRAM with the persistence of traditional storage devices like flash (SSD). While these properties make NVM highly promising, it is not yet clear how to best integrate NVM into the storage layer of modern database systems. Two system designs have been proposed. The first is to use NVM exclusively, i.e., to store all data and index structures on it. However, because NVM has a higher latency than DRAM, this design can be less efficient...
So far, transactional memory, although a promising technique, has suffered from the absence of an efficient hardware implementation. The upcoming Haswell microarchitecture from Intel introduces hardware transactional memory (HTM) in mainstream CPUs. HTM allows for efficient concurrent, atomic operations, which is also highly desirable in the context of databases. On the other hand, HTM has several limitations that, in general, prevent a one-to-one mapping of database transactions to HTM transactions. In this work we devise building blocks that can be used to exploit...
The performance of transactional database systems is critically dependent on the efficient synchronization of in-memory data structures. The traditional approach, fine-grained locking, does not scale on modern hardware. Lock-free data structures, in contrast, scale very well but are extremely difficult to implement and often require additional indirections. In this work, we argue for a middle ground, i.e., synchronization protocols that use locking, but only sparingly. We synchronize the Adaptive Radix Tree (ART) using two such protocols,...
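The "locking, but only sparingly" idea rests on optimistic version validation: readers never lock, they just re-check a per-node version counter after reading and restart if a writer interfered. The single-threaded toy below shows only the validation protocol, not real concurrency or ART itself.

```python
class OptimisticNode:
    """Node guarded by a version counter: even = unlocked, odd = a writer
    holds the node. Readers validate the version instead of locking."""

    def __init__(self, payload):
        self.version = 0
        self.payload = payload

def optimistic_read(node):
    while True:
        v = node.version
        if v % 2 == 1:
            continue               # writer active: retry from the start
        data = node.payload        # speculative, unsynchronized read
        if node.version == v:
            return data            # version unchanged: read was consistent

def locked_write(node, payload):
    assert node.version % 2 == 0, "node already write-locked"
    node.version += 1              # lock: version becomes odd
    node.payload = payload
    node.version += 1              # unlock: new, even version

n = OptimisticNode("a")
locked_write(n, "b")
print(optimistic_read(n))  # -> b
```

Readers thus pay only two version loads in the common uncontended case, which is why such protocols scale far better than fine-grained locking.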
We present the Height Optimized Trie (HOT), a fast and space-efficient in-memory index structure. The core algorithmic idea of HOT is to dynamically vary the number of bits considered at each node, which enables a consistently high fanout and thereby good cache efficiency. The layout of each node is carefully engineered for compactness and fast search using SIMD instructions. Our experimental results, which use a wide variety of workloads and data sets, show that HOT outperforms other state-of-the-art index structures for string keys both in terms...
Disk-based database systems use buffer managers in order to transparently manage data sets larger than main memory. This traditional approach is effective at minimizing the number of I/O operations, but is also a major source of overhead in comparison with in-memory systems. To avoid this overhead, in-memory database systems therefore abandon buffer management altogether, which makes handling data sets larger than main memory very difficult. In this work, we revisit this fundamental dichotomy and design a novel storage manager that is optimized for modern hardware. Our...
In 2013, Microsoft Research proposed the Bw-Tree (humorously termed the "Buzz Word Tree"), a lock-free index that provides high throughput for transactional database workloads in SQL Server's Hekaton engine. The Bw-Tree avoids locks by appending delta records to tree nodes and using an indirection layer that allows it to atomically update physical pointers using compare-and-swap (CaS). Correctly implementing these techniques requires careful attention to detail. Unfortunately, the available papers are missing important...
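The delta-append mechanism can be sketched as follows. This is a hypothetical, heavily simplified illustration: Python has no user-level compare-and-swap, so the CaS on the mapping table is simulated with a lock, and tree structure, consolidation, and splits are all omitted.

```python
import threading

class MappingTable:
    """Bw-Tree-style indirection layer: a logical page ID maps to the head
    of a delta chain. CaS is simulated here with a lock."""

    def __init__(self):
        self.slots = {}
        self._lock = threading.Lock()

    def cas(self, pid, expected, new):
        with self._lock:                 # stands in for an atomic CaS
            if self.slots.get(pid) is expected:
                self.slots[pid] = new
                return True
            return False

class Delta:
    def __init__(self, key, value, nxt):
        self.key, self.value, self.next = key, value, nxt

def insert(table, pid, key, value):
    while True:                          # retry until the CaS succeeds
        head = table.slots.get(pid)
        if table.cas(pid, head, Delta(key, value, head)):
            return

def lookup(table, pid, key):
    node = table.slots.get(pid)
    while node is not None:              # walk the delta chain, newest first
        if node.key == key:
            return node.value
        node = node.next
    return None

t = MappingTable()
insert(t, 1, "k", 10)
insert(t, 1, "k", 20)                    # newer delta shadows the older one
print(lookup(t, 1, "k"))  # -> 20
```

Because updates only prepend deltas and publish them with a single CaS, concurrent writers never block each other; the price is that readers must traverse the chain until a consolidation rewrites the node.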
I/O latency and throughput is one of the major performance bottlenecks for disk-based database systems. Upcoming persistent memory (PMem) technologies, like Intel's Optane DC Persistent Memory Modules, promise to bridge the gap between NAND-based flash (SSD) and DRAM, and thus to eliminate this bottleneck. In this paper, we provide one of the first performance evaluations of PMem in terms of bandwidth and latency. Based on the results, we develop guidelines for efficient PMem usage and two essential I/O primitives tuned for PMem: log writing and block flushing.
NVMe SSDs based on flash are cheap and offer high throughput. Combining several of these devices into a single server enables 10 million I/O operations per second or more. Our experiments show that existing out-of-memory database systems and storage engines achieve only a fraction of this performance. In this work, we demonstrate that it is possible to close the performance gap between hardware and software through an optimized storage engine design. In a heavy out-of-memory setting, where the data set is much larger than main memory, our system...
Compiling queries to machine code is a very efficient way of executing queries. One often overlooked problem with compilation is the time it takes to generate machine code. Even with fast compilation frameworks like LLVM, generating machine code for complex queries can take hundreds of milliseconds. Such durations can be a major disadvantage for workloads that execute many complex, but quick queries. To solve this problem, we propose an adaptive execution framework, which dynamically switches from interpretation to compilation. We also propose a fast bytecode interpreter without costly...
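The switching idea can be sketched with a toy expression evaluator (hypothetical; the paper's framework operates on compiled query plans, not Python ASTs): start by interpreting, and switch to compiled code once the same query has run often enough to amortize compilation time.

```python
# Expressions are nested tuples, e.g. ("add", ("col", 0), ("const", 5)).

def interpret(expr, row):
    op = expr[0]
    if op == "const": return expr[1]
    if op == "col":   return row[expr[1]]
    if op == "add":   return interpret(expr[1], row) + interpret(expr[2], row)
    if op == "mul":   return interpret(expr[1], row) * interpret(expr[2], row)
    raise ValueError(op)

def to_source(expr):
    """Generate Python source for the expression (the 'compiler')."""
    op = expr[0]
    if op == "const": return repr(expr[1])
    if op == "col":   return f"row[{expr[1]}]"
    sym = {"add": "+", "mul": "*"}[op]
    return f"({to_source(expr[1])} {sym} {to_source(expr[2])})"

class AdaptiveQuery:
    COMPILE_AFTER = 3                  # switch threshold, arbitrary here

    def __init__(self, expr):
        self.expr, self.runs, self.compiled = expr, 0, None

    def __call__(self, row):
        self.runs += 1
        if self.compiled is None and self.runs > self.COMPILE_AFTER:
            code = compile(to_source(self.expr), "<query>", "eval")
            self.compiled = lambda row: eval(code, {}, {"row": row})
        if self.compiled is not None:
            return self.compiled(row)  # fast path after the switch
        return interpret(self.expr, row)

q = AdaptiveQuery(("add", ("mul", ("col", 0), ("const", 2)), ("const", 1)))
print([q((i,)) for i in range(5)])  # -> [1, 3, 5, 7, 9]
```

Short-lived queries never pay for compilation, while long-running or frequently repeated ones eventually execute on the fast path, which is exactly the trade-off adaptive execution targets.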
Industrial as well as academic analytics systems are usually evaluated based on well-known standard benchmarks, such as TPC-H or TPC-DS. These benchmarks test various components of the DBMS, including the join optimizer, the implementation of join and aggregation operators, and the concurrency control scheduler. However, these benchmarks fall short of evaluating the "real" challenges imposed by modern BI systems, such as Tableau, that emit machine-generated query workloads. This paper reports a comprehensive study of a set of more than 60k real-world data...
The query engines of most modern database systems are either based on vectorization or on data-centric code generation. These two state-of-the-art query processing paradigms are fundamentally different in terms of system structure and query execution code. Both paradigms have been used to build fast systems. However, until today it is not clear which paradigm yields faster query execution, as many implementation-specific choices obstruct a direct comparison of the architectures. In this paper, we experimentally compare the two models by...
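The structural difference between the two paradigms can be made concrete with a toy query, SELECT sum(x) WHERE x > t (a hypothetical sketch, not either system): vectorization runs each operator over a batch and materializes intermediates, while data-centric code generation fuses the pipeline into one tight loop (hand-written here, in place of generated machine code).

```python
def vectorized(xs, t, batch=1024):
    """Vectorized style: batch-at-a-time operators with a materialized
    intermediate (the qualifying values) between filter and aggregation."""
    total = 0
    for i in range(0, len(xs), batch):
        chunk = xs[i:i + batch]
        qualifying = [x for x in chunk if x > t]   # filter operator
        total += sum(qualifying)                   # aggregation operator
    return total

def data_centric(xs, t):
    """Data-centric style: the whole pipeline fused into one loop,
    keeping each tuple in 'registers' with no intermediate buffer."""
    total = 0
    for x in xs:
        if x > t:
            total += x
    return total

data = list(range(10))
print(vectorized(data, 4), data_centric(data, 4))  # -> 35 35
```

Both compute the same result; the paradigms differ in where the per-tuple work happens and what gets materialized in between, which is the axis the paper's comparison isolates.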
In this paper, we propose ScaleStore, a novel distributed storage engine that exploits DRAM caching, NVMe storage, and RDMA networking to achieve high performance, cost-efficiency, and scalability at the same time. Using low latency RDMA messages, ScaleStore implements a transparent memory abstraction that provides access to the aggregated DRAM of all nodes. In contrast to existing designs such as NAM-DB or FaRM, ScaleStore stores cold data on SSDs (flash), lowering the overall hardware cost significantly. The core of ScaleStore is a caching strategy...
Most database management systems cache pages from storage in a main memory buffer pool. To do this, they either rely on a hash table that translates page identifiers into pointers, or on pointer swizzling, which avoids this translation. In this work, we propose vmcache, a buffer manager design that instead uses hardware-supported virtual memory to translate page identifiers to virtual addresses. In contrast to existing mmap-based approaches, the DBMS retains control over page faulting and eviction. Our design is portable across modern operating systems, supports...
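The translation trick at the heart of this design can be sketched as follows (a minimal illustration only: it shows why no hash table is needed, but omits the parts that make vmcache a buffer manager, i.e., controlled faulting, eviction, and page states). A single large virtual memory region is reserved up front, so a page's address is simply `base + pid * PAGE_SIZE`.

```python
import mmap

PAGE_SIZE = 4096
MAX_PAGES = 64

# Reserve one contiguous anonymous virtual memory region for all pages.
# Translation from page ID to address is pure arithmetic: no hash table,
# no pointer swizzling.
pool = mmap.mmap(-1, MAX_PAGES * PAGE_SIZE)

def page(pid: int) -> memoryview:
    """Return a writable view of the page with the given identifier."""
    off = pid * PAGE_SIZE
    return memoryview(pool)[off:off + PAGE_SIZE]

page(7)[:5] = b"hello"
print(bytes(page(7)[:5]))  # -> b'hello'
```

The OS only backs the virtual pages that are actually touched with physical memory, which is what lets the region be sized for the full data set while the resident working set stays small.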
Remote data structures built with one-sided Remote Direct Memory Access (RDMA) are at the heart of many disaggregated database management systems today. Concurrent access to these data structures by thousands of remote workers necessitates a highly efficient synchronization scheme. Remarkably, our investigation reveals that existing schemes display substantial variations in performance and scalability. Even worse, some schemes do not correctly synchronize, resulting in rare but hard-to-detect data corruption. Motivated by these observations, we...
This paper revisits the longstanding challenge of coordinating database systems with general-purpose OS interfaces, such as POSIX, which often lack tailored support for DB requirements. Existing approaches to this DB-OS co-design struggle with a limited design space, security risks, and compatibility issues. To overcome these hurdles, we propose a new approach that leverages virtualization to elevate the privilege level of DBMS processes. Our method enables the DBMS to fully exploit hardware capabilities via virtualization,...
Large main memory capacity and even larger data sets have motivated hybrid storage systems, which serve most transactions from memory, but can seamlessly transition to flash storage. In such systems, the index structure of choice is usually a B-Tree with pageable nodes. Most academic work considers only fixed-size records, making those designs unsuitable for practical applications. Given the prevalence of B-Trees, surprisingly few available implementations and benchmarks of optimized B-Trees cover variable-sized records. In this...