- Cloud Computing and Resource Management
- Advanced Database Systems and Queries
- IoT and Edge/Fog Computing
- Distributed Systems and Fault Tolerance
- Parallel Computing and Optimization Techniques
- Advanced Data Storage Technologies
- Energy Efficient Wireless Sensor Networks
- Data Management and Algorithms
- Distributed and Parallel Computing Systems
- Graph Theory and Algorithms
- Software System Performance and Reliability
- Caching and Content Delivery
- Peer-to-Peer Network Technologies
- Data Stream Mining Techniques
- Algorithms and Data Compression
- Stochastic Gradient Optimization Techniques
- Advanced Image and Video Retrieval Techniques
- Advanced Clustering Algorithms Research
- Context-Aware Activity Recognition Systems
- Blockchain Technology Applications and Security
- Privacy-Preserving Technologies in Data
- Indoor and Outdoor Localization Technologies
- Web Data Mining and Analysis
- Network Packet Processing and Optimization
- Machine Learning and Data Classification
Technische Universität Berlin
2019-2024
German Research Centre for Artificial Intelligence
2018-2022
Humboldt-Universität zu Berlin
2014-2017
Modern Stream Processing Engines (SPEs) process large data volumes under tight latency constraints. Many SPEs execute processing pipelines using message passing on shared-nothing architectures and apply a partition-based scale-out strategy to handle high-velocity input streams. Furthermore, many state-of-the-art SPEs rely on the Java Virtual Machine to achieve platform independence and to speed up system development by abstracting from the underlying hardware. In this paper, we show that taking hardware into...
GPUs have long been discussed as accelerators for database query processing because of their high processing power and memory bandwidth. However, two main challenges limit their utility for large-scale data processing: (1) the on-board memory capacity is too small to store large data sets, yet (2) the interconnect bandwidth to CPU main memory is insufficient for ad hoc data transfers. As a result, GPU-based systems and algorithms run into a transfer bottleneck and do not scale to large data sets. In practice, CPUs process large data sets faster than GPUs with current technology. In this...
Scale-out stream processing engines (SPEs) are powering large big data applications on high-velocity data streams. Industrial setups require SPEs to sustain outages, varying data rates, and low-latency processing, and thus need to transparently reconfigure stateful queries during runtime. However, state-of-the-art SPEs are not yet ready to handle on-the-fly reconfigurations of queries with terabytes of state due to three problems. These include the network overhead for state migration and the need to maintain consistency during reconfiguration. In this paper, we propose Rhino, a library for efficient...
Stream Processing Engines (SPEs) execute long-running queries on unbounded data streams. They follow an interpretation-based processing model and do not perform runtime optimizations. This limits the utilization of modern hardware and neglects changing data characteristics at runtime. In this paper, we present Grizzly, a novel adaptive query compilation-based SPE that enables highly efficient query execution. We extend query compilation and task-based parallelization for the unique requirements of stream processing and apply...
Data management systems will face several new challenges in supporting IoT applications during the coming years. These challenges arise from managing large numbers of heterogeneous devices and require combining elastic cloud and fog resources into unified fog-cloud environments. In this demonstration, we introduce a smart city simulation called IoTropolis and use it to create interactive eHealth and Smart Grid application scenarios. We use these scenarios to showcase three key challenges. Furthermore, we demonstrate how our recently...
The Internet of Things (IoT) presents a novel computing architecture for data management: a distributed, highly dynamic, and heterogeneous environment of massive scale. Applications for the IoT introduce new challenges for integrating the concepts of fog and cloud computing as well as sensor networks in one unified environment. In this paper, we highlight these major challenges and outline how existing systems handle them. To address these challenges, we introduce the NebulaStream platform, a general-purpose, end-to-end data management system for the IoT. It addresses the heterogeneity...
The Internet of Things (IoT) represents one of the fastest emerging trends in the area of information and communication technology. The main challenge in the IoT is the timely gathering of data streams from potentially millions of sensors. In particular, those sensors are widely distributed, constantly in transit, highly heterogeneous, and unreliable. To gather data in such a dynamic environment efficiently, two techniques have emerged over the last decade: adaptive sampling and adaptive filtering. These techniques dynamically reconfigure sampling rates and filter thresholds...
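To make the idea of adaptive sampling concrete, the following is a minimal sketch, not the paper's actual algorithm: the sampling interval shrinks while consecutive readings differ by more than a threshold and backs off while the signal is stable. All names and constants (`base_interval`, `threshold`, the doubling/halving policy) are illustrative assumptions.

```python
def adaptive_sampling(readings, base_interval=1.0, min_interval=0.25,
                      max_interval=8.0, threshold=0.5):
    """Return (interval, value) pairs: sample faster when the signal is
    volatile, slower when it is stable (illustrative policy only)."""
    interval = base_interval
    last = None
    schedule = []
    for value in readings:
        if last is not None:
            if abs(value - last) > threshold:
                # Volatile signal: halve the interval (sample more often).
                interval = max(min_interval, interval / 2)
            else:
                # Stable signal: double the interval (save energy).
                interval = min(max_interval, interval * 2)
        schedule.append((interval, value))
        last = value
    return schedule
```

A stable stream drives the interval up to `max_interval`, while an oscillating one drives it down to `min_interval`, which is the essential behavior both adaptive sampling and adaptive filtering rely on.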
A recent trend in stream processing is offloading the computation of decomposable aggregation functions (DAFs) from cloud nodes to geo-distributed fog/edge devices to decrease latency and improve energy efficiency. However, deploying DAFs on low-end devices is challenging due to their volatility and limited resources. Additionally, in these environments, creating new operator instances on demand and replicating operators ubiquitously is restricted, posing challenges for achieving load balancing without overloading devices. Existing...
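The property that makes DAFs offloadable is that they split into a lift/combine/lower triple with an associative combine step, so edge devices can pre-aggregate locally and ship only small partials upstream. A minimal sketch (the triple formulation is standard for decomposable aggregates; the names `AVG` and `run_daf` are illustrative):

```python
from functools import reduce

# A decomposable aggregation function as a (lift, combine, lower) triple:
#   lift:    one input value -> partial aggregate
#   combine: merge two partials (associative, so partitioning is free)
#   lower:   final partial -> result
AVG = (
    lambda x: (x, 1),                          # lift: (sum, count)
    lambda a, b: (a[0] + b[0], a[1] + b[1]),   # combine: add componentwise
    lambda p: p[0] / p[1],                     # lower: sum / count
)

def run_daf(daf, partitions):
    """Pre-aggregate each partition locally (edge devices), then merge the
    small partials and finalize (cloud node)."""
    lift, combine, lower = daf
    partials = [reduce(combine, map(lift, part)) for part in partitions]
    return lower(reduce(combine, partials))
```

Because `combine` is associative, splitting the input across devices must not change the result: `run_daf(AVG, [[1, 2], [3, 4, 5]])` equals `run_daf(AVG, [[1, 2, 3, 4, 5]])`.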
Today's users of data processing systems come from different domains, have different levels of expertise, and prefer different programming languages. As a result, analytical workload requirements have shifted from relational to polyglot queries involving user-defined functions (UDFs). Although some systems support polyglot queries, they often embed third-party language runtimes. This embedding induces a high performance overhead, as it causes additional data materialization between execution engines. In this paper, we present Babelfish, a novel...
Database management systems are facing growing data volumes. Previous research suggests that GPUs are well-equipped to quickly process joins and similar stateful operators, as they feature high-bandwidth on-board memory. However, GPUs cannot scale to large data volumes due to two limiting factors: (1)~large state does not fit into the on-board memory, and (2)~spilling to main memory is constrained by the interconnect bandwidth. Thus, CPUs are often the better choice for scalable data processing.
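A back-of-envelope calculation illustrates why the interconnect, not the GPU itself, dominates once data spills to main memory. The bandwidth figures below are rough ballpark assumptions (PCIe 3.0 x16 and HBM2-class memory), not measurements from the paper:

```python
# Ballpark bandwidths (assumptions, not measurements):
pcie_gbps = 16    # host-to-GPU interconnect (PCIe 3.0 x16), GB/s
hbm_gbps = 900    # GPU on-board memory (HBM2-class), GB/s
table_gb = 256    # working set that exceeds on-board capacity

transfer_s = table_gb / pcie_gbps  # time just to ship the data to the GPU
scan_s = table_gb / hbm_gbps       # time for the GPU to scan it once

print(f"transfer: {transfer_s:.1f}s, on-GPU scan: {scan_s:.2f}s, "
      f"ratio: {transfer_s / scan_s:.0f}x")
# → transfer: 16.0s, on-GPU scan: 0.28s, ratio: 56x
```

With these assumptions, shipping the data takes roughly 56x longer than scanning it on the GPU, which is the transfer bottleneck the abstract describes.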
The intra-window join (IaWJ), i.e., joining two input streams over a single window, is a core operation in modern stream processing applications. This paper presents the first comprehensive study on parallelizing the IaWJ on multicore architectures. In particular, we classify IaWJ algorithms into lazy and eager execution approaches. For each approach, there are further design aspects to consider, including different join methods and partitioning schemes, leading to a large design space. Our results show that none of the algorithms always...
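The lazy/eager distinction can be sketched with two single-threaded hash-join variants over one window (a simplification of the paper's parallel setting; the function names and event encoding are illustrative). Eager execution probes on every arrival and produces results immediately; lazy execution buffers the window and joins it in one batch when the window closes:

```python
def eager_iawj(events, key=lambda t: t[0]):
    """Eager: on each arrival, probe the other stream's hash table, then
    insert. `events` is the window in arrival order, tagged 'R' or 'S'."""
    tables, out = {'R': {}, 'S': {}}, []
    for side, tup in events:
        other = tables['S' if side == 'R' else 'R']
        for match in other.get(key(tup), []):
            out.append((tup, match) if side == 'R' else (match, tup))
        tables[side].setdefault(key(tup), []).append(tup)
    return out

def lazy_iawj(events, key=lambda t: t[0]):
    """Lazy: buffer until the window closes, then one batch hash join."""
    r_tuples = [t for side, t in events if side == 'R']
    s_tuples = [t for side, t in events if side == 'S']
    ht = {}
    for r in r_tuples:
        ht.setdefault(key(r), []).append(r)
    return [(r, s) for s in s_tuples for r in ht.get(key(s), [])]
```

Both variants produce the same join result for a completed window; they differ in when results become visible and in how the work can be partitioned across cores, which is exactly the design space the study explores.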
Progressive optimization introduces robustness for database workloads against wrong estimates, skewed data, correlated attributes, or outdated statistics. Previous work focuses on cardinality estimates and relies on expensive counting methods as well as complex learning algorithms. In this paper, we utilize performance counters to drive progressive optimization during query execution. The main advantages are that performance counters introduce virtually no costs on modern CPUs and that their usage enables non-invasive monitoring. We present...
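The control loop behind progressive optimization can be sketched as follows. This is a simplified illustration, not the paper's mechanism: a cheap runtime signal (here observed selectivity standing in for a hardware performance counter) is compared against the optimizer's estimate at fixed intervals, and a large deviation hands control back to the optimizer mid-query. All parameter names and thresholds are assumptions:

```python
def progressive_scan(tuples, est_selectivity, pred, reoptimize,
                     check_every=1000, factor=2.0):
    """Filter `tuples` with `pred`, periodically comparing the observed
    selectivity against the estimate; call `reoptimize` on large deviation."""
    seen = emitted = 0
    out = []
    for t in tuples:
        seen += 1
        if pred(t):
            emitted += 1
            out.append(t)
        if seen % check_every == 0:
            observed = emitted / seen
            # Deviation beyond `factor` in either direction triggers
            # mid-query re-optimization.
            if (observed > est_selectivity * factor
                    or observed < est_selectivity / factor):
                reoptimize(observed)
    return out
```

For example, if the optimizer estimated a selectivity of 0.5 but only 10% of the first 1000 tuples qualify, the monitor fires and the remaining plan can be re-optimized with the observed value.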
The Internet of Things (IoT) combines large data centers with (mobile, networked) edge devices that are constrained both in compute power and energy budget. Modern edge devices contribute to query processing by leveraging accelerated processing units, multicore CPUs, or GPUs. Therefore, the IoT presents the challenges of 1) minimizing the energy consumed while sustaining a given throughput, and 2) processing increasingly complex queries within...
Data science workflows are largely exploratory, dealing with under-specified objectives, open-ended problems, and unknown business value. Therefore, little investment is made in the systematic acquisition, integration, and pre-processing of data. This lack of infrastructure results in redundant manual effort and computation. Furthermore, central data consolidation is not always technically or economically desirable, or even feasible (e.g., due to privacy and/or data ownership). The ExDRa system aims to provide infrastructure for this...
Join ordering and query optimization are crucial for query performance but remain challenging due to unknown or changing characteristics of intermediates, especially for complex queries with many joins. Over the past two decades, a spectrum of techniques for adaptive query processing (AQP)---including inter-/intra-operator adaptivity and tuple routing---has been proposed to address these challenges. However, commercial database systems in practice do not implement holistic AQP because these techniques increase system complexity...
Engineering high-performance query execution engines is a challenging task. Query compilation provides excellent performance, but at the same time introduces significant system complexity, as it makes the engine hard to build, debug, and maintain. To overcome this problem, we propose Nautilus, a framework that combines the ease of use of interpretation with the performance of compilation. On the one hand, Nautilus provides an interpretation-based operator interface that enables engineers to implement operators using imperative C++ code to ensure...
Modern processors employ sophisticated techniques such as speculative or out-of-order execution to hide memory latencies and keep their pipelines fully utilized. However, these techniques introduce high complexity and variance into query processing. In particular, they are transparent to DBMS operations, since they are managed by the CPU internally. To utilize the capabilities of modern CPUs, it is necessary to understand their characteristics and adjust operators as well as cost models accordingly.
Remote Direct Memory Access (RDMA) hardware has bridged the gap between network and main-memory speed and thus invalidated the common assumption that the network is often the bottleneck in distributed data processing systems. However, high-speed networks do not provide "plug-and-play" performance (e.g., using IP-over-InfiniBand) and require a careful co-design of system and application logic. As a result, system designers need to rethink the architecture of their data management systems to benefit from RDMA acceleration. In this paper, we focus...