- Advanced Database Systems and Queries
- Data Management and Algorithms
- Cloud Computing and Resource Management
- Parallel Computing and Optimization Techniques
- Scientific Computing and Data Management
- Data Stream Mining Techniques
- Advanced Data Storage Technologies
- Graph Theory and Algorithms
- Data Quality and Management
- Distributed Systems and Fault Tolerance
- Data Mining Algorithms and Applications
- Distributed and Parallel Computing Systems
- Machine Learning and Data Classification
- Semantic Web and Ontologies
- Data Visualization and Analytics
- Big Data and Business Intelligence
- Algorithms and Data Compression
- IoT and Edge/Fog Computing
- Software System Performance and Reliability
- Peer-to-Peer Network Technologies
- Energy Efficient Wireless Sensor Networks
- Advanced Image and Video Retrieval Techniques
- Caching and Content Delivery
- Neural Networks and Applications
- Service-Oriented Architecture and Web Services
Technische Universität Berlin
2015-2024
German Research Centre for Artificial Intelligence
2015-2023
Berlin Institute for the Foundations of Learning and Data
2023
Singapore University of Technology and Design
2022
Walter de Gruyter (Germany)
2020
Delft University of Technology
2019
University of Potsdam
2019
German Central Institute for Social Issues
2019
IBM Research - Almaden
2004-2018
DSI Informationstechnik (Germany)
2018
This paper presents BigEarthNet, a new large-scale multi-label Sentinel-2 benchmark archive. BigEarthNet consists of 590,326 image patches, each of which is a section of: i) 120×120 pixels for the 10m bands; ii) 60×60 pixels for the 20m bands; and iii) 20×20 pixels for the 60m bands. Unlike most existing archives, each image patch is annotated by multiple land-cover classes (i.e., multi-labels) that are provided from the CORINE Land Cover database of the year 2018 (CLC 2018). BigEarthNet is significantly larger than the existing archives in remote sensing (RS) and thus is much more convenient to be used...
We present a parallel data processor centered around a programming model of so-called Parallelization Contracts (PACTs) and the scalable parallel execution engine Nephele [18]. The PACT programming model is a generalization of the well-known map/reduce programming model, extending it with further second-order functions, as well as with Output Contracts that give guarantees about the behavior of a function. We describe methods to transform a PACT program into a data flow for Nephele, which executes its sequential building blocks in parallel and deals with communication, synchronization, and fault tolerance....
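The contract idea, where a second-order function takes a first-order user function and decides how records are grouped before the user function is applied, can be sketched in plain Python. This is a toy illustration only, not the Nephele/PACT API; all names below are hypothetical:

```python
# Hypothetical sketch of PACT-style second-order functions.

def pact_map(user_fn, records):
    """MAP contract: user_fn is invoked independently on every record."""
    return [out for rec in records for out in user_fn(rec)]

def pact_reduce(user_fn, records, key):
    """REDUCE contract: user_fn sees all records that share a key."""
    groups = {}
    for rec in records:
        groups.setdefault(key(rec), []).append(rec)
    return [out for group in groups.values() for out in user_fn(group)]

def pact_match(user_fn, left, right, key_l, key_r):
    """MATCH contract (beyond map/reduce): user_fn is invoked for each
    pair of records from the two inputs that share a key (an equi-join)."""
    index = {}
    for rec in left:
        index.setdefault(key_l(rec), []).append(rec)
    return [out
            for r in right
            for l in index.get(key_r(r), [])
            for out in user_fn(l, r)]

# Word count expressed with MAP + REDUCE.
lines = ["to be or", "not to be"]
pairs = pact_map(lambda line: [(w, 1) for w in line.split()], lines)
counts = pact_reduce(lambda group: [(group[0][0], sum(c for _, c in group))],
                     pairs, key=lambda kv: kv[0])
```

An Output Contract would additionally annotate a user function, e.g. promising that it leaves the key unchanged, so the compiler can skip a re-partitioning step.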
The rich dependency structure found in the columns of real-world relational databases can be exploited to great advantage, but can also cause query optimizers (which usually assume that columns are statistically independent) to underestimate the selectivities of conjunctive predicates by orders of magnitude. We introduce CORDS, an efficient and scalable tool for the automatic discovery of correlations and soft functional dependencies between columns. CORDS searches for column pairs that might have interesting and useful relations...
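The kind of dependence test such a tool can run on a sampled column pair can be illustrated with a chi-squared statistic: a large value signals correlation, a value near zero is consistent with independence. This sketch is illustrative only; CORDS's actual sampling strategy and thresholds differ:

```python
# Toy chi-squared test for independence of two columns over sampled rows.
from collections import Counter

def chi_squared(rows):
    n = len(rows)
    c1 = Counter(a for a, _ in rows)      # marginal counts of column 1
    c2 = Counter(b for _, b in rows)      # marginal counts of column 2
    joint = Counter(rows)                 # joint counts of value pairs
    stat = 0.0
    for a, na in c1.items():
        for b, nb in c2.items():
            expected = na * nb / n        # count expected under independence
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    return stat

# Perfectly correlated pair (city determines country): large statistic.
correlated = [("Paris", "FR"), ("Berlin", "DE")] * 50
# Independent pair (all combinations equally likely): statistic is zero.
independent = [(a, b) for a in ("x", "y") for b in ("u", "v")] * 25

corr_stat = chi_squared(correlated)
indep_stat = chi_squared(independent)
```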
Virtually every commercial query optimizer chooses the best plan for a query using a cost model that relies heavily on accurate cardinality estimation. Cardinality estimation errors can occur due to the use of inaccurate statistics, invalid assumptions about attribute independence, parameter markers, and so on. Such errors may cause the optimizer to choose a sub-optimal plan. We present an approach to query processing that is extremely robust because it is able to detect and recover from cardinality estimation errors. We call this approach "progressive query optimization" (POP). POP validates...
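The detect-and-recover idea can be sketched as a checkpoint that guards the optimizer's estimate with a validity range and re-optimizes when the actual row count falls outside it. The names and the plan choice below are hypothetical illustrations, not the paper's implementation:

```python
# Toy sketch of a POP-style validity check.

def run_with_pop(scan, validity_range, plan_for):
    rows = list(scan)                       # materialize at the CHECK point
    lo, hi = validity_range
    if lo <= len(rows) <= hi:
        return plan_for("estimated")(rows)  # estimate held: keep current plan
    return plan_for(len(rows))(rows)        # mis-estimate: re-optimize remainder

def plan_for(cardinality):
    # Illustrative plan choice: small inputs favor a nested-loop style plan,
    # large inputs a hash-based one.
    if cardinality == "estimated" or cardinality < 100:
        return lambda rows: ("nested_loop", len(rows))
    return lambda rows: ("hash_join", len(rows))

plan_kept = run_with_pop(range(10), (0, 100), plan_for)      # estimate valid
plan_switched = run_with_pop(range(500), (0, 100), plan_for) # re-optimized
```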
The multi-core architectures of today's computer systems make parallelism a necessity for performance-critical applications. Writing such applications in a generic, hardware-oblivious manner is a challenging problem: current database systems thus rely on labor-intensive and error-prone manual tuning to exploit the full potential of modern parallel hardware, like multi-core CPUs and graphics cards. We propose an alternative design for a database engine, based on a single set of hardware-oblivious operators, which are compiled down to the actual hardware at runtime. This design reduces...
Parallel dataflow systems are a central part of most analytic pipelines for big data. The iterative nature of many analysis and machine learning algorithms, however, is still a challenge for current systems. While certain types of bulk iterative algorithms are supported by novel dataflow frameworks, these frameworks cannot exploit the computational dependencies present in many problems, such as graph algorithms. As a result, these algorithms are inefficiently executed and have led to specialized systems based on other paradigms, such as message passing or shared memory. We propose a method to integrate...
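The gap between bulk and incremental iteration can be illustrated with connected components: an incremental (delta) iteration re-processes only vertices whose label changed in the previous superstep, so later supersteps touch ever-smaller parts of the graph. This is an illustrative sketch, not the actual dataflow API:

```python
# Connected components via label propagation with a shrinking workset.

def connected_components(edges, vertices):
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    label = {v: v for v in vertices}   # each vertex starts as its own component
    workset = set(vertices)            # delta set: vertices to re-process
    supersteps = 0
    while workset:
        supersteps += 1
        changed = set()
        for v in workset:
            for n in neighbors[v]:
                if label[v] < label[n]:    # propagate the smaller label
                    label[n] = label[v]
                    changed.add(n)
        workset = changed              # only changed vertices survive
    return label, supersteps

# Two components: {1, 2, 3} and {4, 5}.
label, steps = connected_components([(1, 2), (2, 3), (4, 5)], [1, 2, 3, 4, 5])
```

A bulk iteration would re-process all five vertices in every superstep; here the workset empties as labels stabilize.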
Database researchers paint big data as a defining challenge. To make the most of the enormous opportunities at hand will require focusing on five research areas.
The need for scalable and efficient stream analysis has led to the development of many open-source streaming data processing systems (SDPSs) with highly diverging capabilities and performance characteristics. While first initiatives try to compare the systems for simple workloads, there is a clear gap in detailed analyses of the systems' performance characteristics. In this paper, we propose a framework for benchmarking distributed stream processing engines. We use our suite to evaluate three widely used SDPSs in detail, namely Apache Storm, Apache Spark, and Apache Flink. Our evaluation focuses...
Modern Stream Processing Engines (SPEs) process large data volumes under tight latency constraints. Many SPEs execute processing pipelines using message passing on shared-nothing architectures and apply a partition-based scale-out strategy to handle high-velocity input streams. Furthermore, many state-of-the-art SPEs rely on a Java Virtual Machine to achieve platform independence and to speed up system development by abstracting from the underlying hardware. In this paper, we show that taking the underlying hardware into account...
Earth observation (EO) is a prime instrument for monitoring land and ocean processes, studying the dynamics at work, and taking the pulse of our planet. This article gives a bird's eye view of the essential scientific tools and approaches informing and supporting the transition from raw EO data to usable EO-based information. The promises, as well as the current challenges of these developments, are highlighted under dedicated sections. Specifically, we cover the impact of (i) Computer vision; (ii) Machine learning; (iii) Advanced...
Increasingly large numbers of situational applications are being created by enterprise business users as a by-product of solving day-to-day problems. In an effort to address the demand for such applications, corporate IT is moving toward Web 2.0 architectures. In particular, the corporate intranet is evolving into a platform of readily accessible data and services where communities of business users can assemble and deploy situational applications. Damia is a web-style data integration platform developed to address the data problem presented by such applications, which often need to access and combine data from a variety of sources...
Visual analysis of high-volume time series data is ubiquitous in many industries, including finance, banking, and discrete manufacturing. Contemporary, RDBMS-based systems for the visualization of time series data have difficulty coping with the hard latency requirements and high ingestion rates of interactive visualizations. Existing solutions for lowering the data volume disregard the semantics of visualizations and result in visualization errors. In this work, we introduce M4, an aggregation-based time series dimensionality reduction technique that provides error-free...
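The aggregation idea can be sketched as follows: for a chart that is w pixel columns wide, keep per column only the tuples carrying the minimum value, maximum value, first timestamp, and last timestamp, i.e., at most 4·w tuples. This is illustrative Python rather than the paper's SQL rewriting, and the bucket arithmetic is simplified:

```python
# M4-style reduction: at most 4 tuples per pixel column.

def m4(series, w):
    """series: list of (timestamp, value), sorted by timestamp."""
    t0, t1 = series[0][0], series[-1][0]
    buckets = [[] for _ in range(w)]
    for t, v in series:
        i = min(int((t - t0) * w / (t1 - t0 + 1e-9)), w - 1)
        buckets[i].append((t, v))
    out = set()
    for b in buckets:
        if not b:
            continue
        out.add(min(b))                       # first tuple (min timestamp)
        out.add(max(b))                       # last tuple (max timestamp)
        out.add(min(b, key=lambda p: p[1]))   # tuple with min value
        out.add(max(b, key=lambda p: p[1]))   # tuple with max value
    return sorted(out)

series = [(t, (t * 7) % 13) for t in range(1000)]  # 1000 points
reduced = m4(series, 50)                           # at most 200 points
```

Because only the line pixels between these four extrema per column can differ, the rendered chart stays visually faithful while the transferred volume drops sharply.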
Quickly and accurately estimating the selectivity of multidimensional predicates is a vital part of a modern relational query optimizer. The state of the art in this field are multidimensional histograms, which offer good estimation quality but are complex to construct and hard to maintain. Kernel Density Estimation (KDE) is an interesting alternative that does not suffer from these problems. However, existing KDE-based selectivity estimators can hardly compete with the estimation quality of state-of-the-art methods.
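A minimal one-dimensional sketch of KDE-based selectivity estimation: place a Gaussian kernel on each sample point and estimate the selectivity of a range predicate as the average probability mass the kernels put inside the range. The published estimators are multidimensional and tune the bandwidth numerically; Silverman's rule-of-thumb here is a simplifying assumption:

```python
# 1-D KDE selectivity estimate for the predicate lo <= x <= hi.
import math
import random

def kde_selectivity(sample, lo, hi):
    n = len(sample)
    mean = sum(sample) / n
    sd = (sum((x - mean) ** 2 for x in sample) / n) ** 0.5
    h = 1.06 * sd * n ** -0.2          # Silverman's rule-of-thumb bandwidth
    phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # normal CDF
    # Average probability mass each sample's kernel places inside [lo, hi].
    return sum(phi((hi - x) / h) - phi((lo - x) / h) for x in sample) / n

random.seed(0)
sample = [random.gauss(0, 1) for _ in range(1000)]
est = kde_selectivity(sample, -1, 1)   # true answer is about 0.683 for N(0, 1)
```

Unlike a histogram, such a model can be maintained by simply replacing sample points as the underlying data changes.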
Query processing on GPU-style coprocessors is severely limited by the movement of data. With teraflops of compute throughput in one device, even high-bandwidth memory cannot provision enough data for a reasonable utilization.
Approximately every five years, a group of database researchers meets to do a self-assessment of our community, including reflections on our impact on the industry as well as challenges facing our research community. This report summarizes the discussion and conclusions of the 9th such meeting, held during October 9-10, 2018 in Seattle.
Accurately predicting the cardinality of intermediate plan operations is an essential part of any modern relational query optimizer. The accuracy of said estimates has a strong and direct impact on the quality of the generated plans, and incorrect estimates can have a negative impact on query performance. One of the biggest challenges in this field is to predict the result size of join operations. Kernel Density Estimation (KDE) is a statistical method to estimate multivariate probability distributions from a data sample. Previously, we introduced a modern,...
GPUs have long been discussed as accelerators for database query processing because of their high processing power and memory bandwidth. However, two main challenges limit their utility for large-scale data processing: (1) the on-board memory capacity is too small to store large data sets, yet (2) the interconnect bandwidth to CPU main memory is insufficient for ad hoc data transfers. As a result, GPU-based systems and algorithms run into a transfer bottleneck and do not scale to large data sets. In practice, CPUs process large-scale data faster than GPUs with current technology. In this...
SQL has emerged as an industry standard for querying relational database management systems, largely because a user need only specify what data is wanted, not the details of how to access that data. A query optimizer uses a mathematical model of query execution to determine automatically the best way to access and process any given query. This model is heavily dependent upon the optimizer's estimates of the number of rows that will result at each step of the query execution plan (QEP), especially for complex queries involving many predicates and/or operations. These estimates rely...
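Why such row estimates go wrong for conjunctive predicates can be shown with a toy example: multiplying per-predicate selectivities assumes the predicates are independent, which correlated columns violate. The data below is hypothetical:

```python
# Correlated predicates: make = 'Honda' and model = 'Civic'
# (in this toy table, only Hondas are Civics).
rows = [("Honda", "Civic")] * 50 + [("BMW", "3 Series")] * 50

sel_make = sum(m == "Honda" for m, _ in rows) / len(rows)     # 0.5
sel_model = sum(mo == "Civic" for _, mo in rows) / len(rows)  # 0.5

# Under the independence assumption the optimizer multiplies selectivities.
independent_est = sel_make * sel_model                        # 0.25

# The true combined selectivity is twice as large.
actual = sum(m == "Honda" and mo == "Civic" for m, mo in rows) / len(rows)
```

With more correlated predicates the factors compound, which is how estimates end up off by orders of magnitude on real data.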