Lauritz Thamsen

ORCID: 0000-0003-3755-1503
Research Areas
  • Cloud Computing and Resource Management
  • IoT and Edge/Fog Computing
  • Distributed and Parallel Computing Systems
  • Scientific Computing and Data Management
  • Data Stream Mining Techniques
  • Advanced Data Storage Technologies
  • Software System Performance and Reliability
  • Distributed systems and fault tolerance
  • Parallel Computing and Optimization Techniques
  • Water Systems and Optimization
  • Real-Time Systems Scheduling
  • Energy Efficient Wireless Sensor Networks
  • Network Time Synchronization Technologies
  • Embedded Systems Design Techniques
  • Green IT and Sustainability
  • Air Quality Monitoring and Forecasting
  • Traffic Prediction and Management Techniques
  • Advanced Database Systems and Queries
  • Software-Defined Networks and 5G
  • Digital Media Forensic Detection
  • Data Visualization and Analytics
  • Vehicular Ad Hoc Networks (VANETs)
  • Artificial Intelligence in Healthcare
  • Context-Aware Activity Recognition Systems
  • Caching and Content Delivery

University of Glasgow
2022-2024

Technische Universität Berlin
2015-2022

Humboldt-Universität zu Berlin
2021-2022

Vanderbilt University
2021

Hasso Plattner Institute
2012-2015

University of Potsdam
2014-2015

Depending on energy sources and demand, the carbon intensity of the public power grid fluctuates over time. Exploiting this variability is an important factor in reducing the emissions caused by data centers. However, regional differences in the availability of low-carbon energy sources make it hard to provide general best practices for when to consume electricity. Moreover, existing research in this domain focuses mostly on carbon-aware workload migration across geo-distributed data centers, or addresses demand response purely from...

10.1145/3464298.3493399 preprint EN 2021-12-02

Federated Learning (FL) is an emerging machine learning technique that enables distributed model training across data silos or edge devices without data sharing. Yet, FL inevitably introduces inefficiencies compared to centralized model training, which will further increase the already high energy usage and associated carbon emissions of machine learning in the future. One idea to reduce FL's carbon footprint is to schedule training jobs based on the availability of renewable excess energy that can occur at certain times and places in the grid. However, in the presence of such volatile...

10.1145/3632775.3639589 preprint EN 2024-02-19

The appeal of MapReduce has spawned a family of systems that implement or extend it. In order to enable parallel collection processing with User-Defined Functions (UDFs), these systems expose extensions of the MapReduce programming model as library-based dataflow APIs that are tightly coupled to their underlying runtime engine. Expressing data analysis algorithms with complex data and control flow structure using such APIs reveals a number of limitations that impede programmer's productivity.

10.1145/2723372.2750543 article EN 2015-05-27

The Function-as-a-Service (FaaS) paradigm has a lot of potential as a computing model for fog environments comprising both cloud and edge nodes, as compute requests can be scheduled across the entire fog continuum in a fine-grained manner. When the request rate exceeds capacity limits at the resource-constrained edge, some functions need to be offloaded towards the cloud. In this paper, we present an auction-inspired approach in which application developers bid on resources while fog nodes decide locally which functions to execute and which to offload...

10.1002/spe.3058 article EN Software Practice and Experience 2021-12-06

Despite constant improvements in efficiency, today's data centers and networks consume enormous amounts of energy and this demand is expected to rise even further. An important research question is whether and how fog computing can curb this trend. As real-life deployments of fog infrastructure are still rare, a significant part of research relies on simulations. However, existing power models usually only target particular components such as compute nodes or battery-constrained edge devices. Combining analytical...

10.1109/icfec51620.2021.00012 article EN 2021-05-01

Distributed dataflow systems like Spark or Flink enable users to analyze large datasets. Users create programs by providing sequential user-defined functions for a set of well-defined operations, select a set of resources, and the systems automatically distribute the jobs across these resources. However, selecting resources for specific performance needs is inherently difficult and users consequently tend to overprovision, which results in poor cluster utilization. At the same time, many important jobs are executed recurringly in production...

10.1109/pccc.2016.7820629 article EN 2016-12-01

Scientific workflow management systems like Nextflow support large-scale data analysis by abstracting away the details of scientific workflows. In these systems, workflows consist of several abstract tasks, of which instances are run in parallel and transform input partitions into output partitions. Resource managers like Kubernetes execute such workflow tasks on cluster infrastructures. However, these resource managers only consider the number of CPUs and the amount of available memory when assigning tasks to resources; they do not consider hardware...

10.1109/bigdata52589.2021.9671519 article EN 2021 IEEE International Conference on Big Data (Big Data) 2021-12-15

The Internet of Things describes a network of physical devices interacting and producing vast streams of sensor data. At present there are a number of general challenges which exist while developing solutions for use cases involving the monitoring and control of urban infrastructures. These include the need for a dependable method for extracting value from these high volume streams of time sensitive data which is adaptive to changing workloads. Low-latency access to the current state for live monitoring is a necessity as well as the ability to perform queries on historical...

10.1109/ic2e52221.2021.00041 article EN 2021-10-01

In this paper we introduce our vision of a Cognitive Computing Continuum to address the changing IT service provisioning towards a distributed, opportunistic, self-managed collaboration between heterogeneous devices outside the traditional data centre boundaries. The focal point of this continuum are cognitive devices, which have to make decisions autonomously using their on-board computation and storage capacity based on information sensed from their environment. Such devices are moving and cannot rely on fixed infrastructure...

10.1109/ccgrid51090.2021.00076 article EN 2021-05-01

Ever since the commercial offerings of the Cloud started appearing in 2006, the landscape of cloud computing has been undergoing remarkable changes with the emergence of many different types of service offerings, developer productivity enhancement tools, and new application classes as well as the manifestation of cloud functionality closer to the user at the edge. The notion of utility computing, however, has remained constant throughout its evolution, which means that cloud users always seek to save costs of leasing cloud resources while maximizing their use....

10.1109/ic2e52221.2021.00044 article EN 2021-10-01

Many scientific workflow scheduling algorithms need to be informed about task runtimes a-priori to conduct efficient scheduling. In heterogeneous cluster infrastructures, this problem becomes aggravated because these runtimes are required for each task-node pair. Using historical data is often not feasible as logs are typically not retained indefinitely and workloads as well as infrastructure change. In contrast, online methods, which predict task runtimes on specific nodes while the workflow is running, have to cope with the lack of example runs,...

10.1145/3538712.3538739 preprint EN 2022-07-06

Distributed dataflow systems enable the use of clusters for scalable data analytics. However, selecting appropriate cluster resources for a processing job is often not straightforward. Performance models trained on historical executions of a concrete job are helpful in such situations, yet they are usually bound to a specific job execution context (e.g. node type, software versions, job parameters) due to the few considered input parameters. Even in case of slight context changes, such supportive models need to be retrained and cannot benefit from...

10.1109/cluster48925.2021.00052 preprint EN 2021-09-01

Distributed dataflow systems enable data-parallel processing of large datasets on clusters. Public cloud providers offer a large variety and quantity of resources that can be used for such clusters. Yet, selecting appropriate cloud resources for dataflow jobs - that neither lead to bottlenecks nor to low resource utilization - is often challenging, even for expert users such as data engineers. We present C3O, a collaborative system for optimizing data processing cluster configurations in public clouds based on shared historical runtime data. The shared data is utilized for predicting the runtimes...

10.1109/ic2e52221.2021.00018 preprint EN 2021-10-01

Scientific workflow management systems (SWMSs) and resource managers together ensure that tasks are scheduled on provisioned resources so that all dependencies are obeyed, and some optimization goal, such as makespan minimization, is achieved. In practice, however, there is no clear separation of scheduling responsibilities between an SWMS and a resource manager because there exists no agreed-upon separation of concerns between their different components. This has two consequences. First, the lack of a standardized API to exchange scheduling information between SWMSs...

10.1109/ccgrid57682.2023.00025 preprint EN 2023-05-01

Low-latency processing of data streams from distributed sensors is becoming increasingly important for a growing number of IoT applications. In these environments sensor data collected at the edge of the network is typically transmitted in a number of hops: from devices to intermediate resources to clusters of cloud resources. Scheduling processing tasks of dataflow jobs on all the resources of these environments can significantly reduce application latencies and network congestion. However, for this schedulers need to take the heterogeneity of processing resources and network topologies into account. This paper examines multiple...

10.1109/bigdata.2018.8622651 article EN 2018 IEEE International Conference on Big Data (Big Data) 2018-12-01

Operation and maintenance of large distributed cloud applications can quickly become unmanageably complex, putting human operators under immense stress when problems occur. Utilizing machine learning for identification and localization of anomalies in such systems supports human experts and enables fast mitigation. However, due to the various inter-dependencies of system components, anomalies do not only affect their origin but propagate through the distributed system. Taking this into account, we present Arvalus and its variant D-Arvalus,...

10.1109/cloudintelligence52565.2021.00011 preprint EN 2021-05-01

Scientific workflows typically comprise a multitude of different processing steps which often are executed in parallel on different partitions of the input data. These executions, in turn, must be scheduled on the compute nodes of the computational infrastructure at hand. This assignment is complicated by the facts that (a) tasks typically have highly heterogeneous resource requirements and (b) that in many infrastructures, compute nodes offer highly heterogeneous resources. In consequence, predictions of the runtime of a given task on a given node, as required by many scheduling algorithms, are often rather...

10.1109/ipccc55026.2022.9894299 preprint EN 2022-10-12

As a result of the many technical advances in microcomputers and mobile connectivity, the Internet of Things (IoT) has been on the rise in the recent decade. Due to the broad spectrum of applications, networks facilitating IoT scenarios can be of very different scale and complexity. Additionally, connected devices are uncommonly heterogeneous, including micro controllers, smartphones, fog nodes and server infrastructures. Therefore, testing IoT applications is difficult, motivating adequate tool support.

10.1145/3368235.3368832 article EN 2019-12-02

Embedded systems have been used to control physical environments for decades. Usually, such use cases require low latencies between commands and actions as well as a high predictability of the expected worst-case delay. To achieve this on small, low-powered microcontrollers, Real-Time Operating Systems (RTOSs) are used to manage the different tasks on these machines as deterministically as possible. However, with the advent of the Internet of Things (IoT) in industrial applications, the same embedded systems are now equipped with networking...

10.1109/ipccc50635.2020.9391536 preprint EN 2020-11-06

Analyzing large datasets with distributed dataflow systems requires the use of clusters. Public cloud providers offer a large variety and quantity of resources that can be used for such clusters. However, picking the appropriate resources in both type and number can often be challenging, as the selected configuration needs to match a distributed dataflow job's resource demands and access patterns. A good cluster configuration avoids hardware bottlenecks and maximizes resource utilization, avoiding costly overprovisioning. We propose a collaborative approach for finding optimal cluster configurations based...

10.1109/bigdata50022.2020.9377994 article EN 2020 IEEE International Conference on Big Data (Big Data) 2020-12-10

Internet of Things (IoT) applications promise to make many aspects of our lives more efficient and adaptive through the use of distributed sensing and computing nodes. A central aspect of such applications is their complex communication behavior that is heavily influenced by the physical environment of the system. To continuously improve IoT applications, a staging environment is needed that can provide operating conditions representative of deployments in the actual production environments - similar to what is common practice in cloud application development today....

10.1109/percomworkshops51409.2021.9431087 article EN 2021 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops) 2021-03-22