Morgan K. Geldenhuys

ORCID: 0009-0006-5037-8353
Research Areas
  • Cloud Computing and Resource Management
  • Distributed systems and fault tolerance
  • Software System Performance and Reliability
  • Data Stream Mining Techniques
  • Advanced Database Systems and Queries
  • IoT and Edge/Fog Computing
  • Energy Efficient Wireless Sensor Networks
  • Air Quality Monitoring and Forecasting
  • Traffic Prediction and Management Techniques
  • Water Quality Monitoring Technologies
  • Water Systems and Optimization
  • Distributed and Parallel Computing Systems
  • Insect and Arachnid Ecology and Behavior
  • Advanced Data Storage Technologies
  • Network Security and Intrusion Detection
  • Mobile Crowdsensing and Crowdsourcing
  • Insect and Pesticide Research
  • Parallel Computing and Optimization Techniques
  • Anomaly Detection Techniques and Applications
  • Software Engineering Research
  • Data Management and Algorithms
  • Time Series Analysis and Forecasting
  • Advanced Optical Network Technologies
  • Plant and animal studies

Technische Universität Berlin
2019-2025

This study presents a framework for integrating and qualifying Machine Learning (ML) in Maintenance, Repair, and Overhaul (MRO) processes for gas turbines. Using neural networks for damage detection and decision trees for repair estimation, it emphasizes continuous qualification aligned with ISO/IEC standards and responsible AI principles. An interactive guide supports systematic ML implementation, ensuring transparency and compliance in Industry 4.0. Validated through two turbine blade case studies, the approach...

10.1515/zwf-2024-0133 article EN cc-by Zeitschrift für wirtschaftlichen Fabrikbetrieb 2025-03-20

The Internet of Things describes a network of physical devices interacting and producing vast streams of sensor data. At present there are a number of general challenges which exist while developing solutions for use cases involving the monitoring and control of urban infrastructures. These include the need for a dependable method for extracting value from these high volume streams of time sensitive data which is adaptive to changing workloads. Low-latency access to the current state for live monitoring is a necessity as well as the ability to perform queries on historical...

10.1109/ic2e52221.2021.00041 article EN 2021-10-01

Operation and maintenance of large distributed cloud applications can quickly become unmanageably complex, putting human operators under immense stress when problems occur. Utilizing machine learning for identification and localization of anomalies in such systems supports human experts and enables fast mitigation. However, due to the various inter-dependencies of system components, anomalies do not only affect their origin but propagate through the distributed system. Taking this into account, we present Arvalus and its variant D-Arvalus,...

10.1109/cloudintelligence52565.2021.00011 preprint EN 2021-05-01

The emergence of the Internet of Things has seen the introduction of numerous connected devices used for the monitoring and control of even Critical Infrastructures. Distributed stream processing has become key to analyzing data generated by these connected devices and improving our ability to make decisions. However, optimizing these systems towards specific Quality of Service targets is a difficult and time-consuming task, due to the large-scale distributed systems involved, the existence of so many configuration parameters, and the inability to easily determine the impact of tuning...

10.1109/bigdata47090.2019.9005504 article EN 2019 IEEE International Conference on Big Data (Big Data) 2019-12-01

To maintain a stable Quality of Service (QoS), these systems require a sufficient allocation of resources. At the same time, over-provisioning can result in wasted energy and high operating costs. Therefore, to maximize resource utilization, autoscaling methods have been proposed that aim to efficiently match the resource allocation with the incoming workload. However, determining when and by how much to scale remains a significant challenge. Given the long-running nature of DSP jobs, scaling actions need to be executed at runtime, and to maintain a good QoS,...

10.1145/3629526.3645042 article EN cc-by 2024-05-07

Fault tolerance is a property which needs deeper consideration when dealing with streaming jobs requiring high levels of availability and low-latency processing even in case of failures where Quality-of-Service constraints must be adhered to. Typically, systems achieve fault tolerance and the ability to recover automatically from partial failures by implementing Checkpoint and Rollback Recovery. However, this is an expensive operation which impacts negatively on the overall performance of the system and manually optimizing fault tolerance for specific...

10.1109/bigdata50022.2020.9378474 article EN 2020 IEEE International Conference on Big Data (Big Data) 2020-12-10

Distributed Stream Processing systems have become an essential part of big data processing platforms. They are characterized by the high-throughput processing of near to real-time event streams with the goal of delivering low-latency results and thus enabling time-sensitive decision making. At the same time, results are expected to be consistent even in the presence of partial failures where exactly-once processing guarantees are required for correctness. Stream processing workloads are oftentimes dynamic in nature which makes static configurations highly inefficient as time...

10.1109/icws55610.2022.00041 article EN 2022-07-01

Distributed dataflow systems like Spark and Flink enable the use of clusters for scalable data analytics. While runtime prediction models can be used to initially select appropriate cluster resources given target runtimes, the actual runtime performance of dataflow jobs depends on several factors and varies over time. Yet, in many situations, dynamic scaling can be used to meet formulated runtime targets despite significant performance variance. This paper presents Enel, a novel dynamic scaling approach that uses message propagation on an attributed graph to model dataflow jobs and, thus,...

10.1109/ipccc51483.2021.9679361 preprint EN 2021-10-29

With weather becoming more extreme both in terms of longer dry periods and more severe rain events, municipal water networks are increasingly under pressure. The effects include damages to the pipes, flash floods on the streets and combined sewer overflows. Retrofitting underground infrastructure is very expensive, thus water infrastructure operators are increasingly looking to deploy IoT solutions that promise to alleviate the problems at a fraction of the cost. In this paper, we report on preliminary results from an ongoing joint research project, specifically...

10.1109/bigdata50022.2020.9378138 article EN 2020 IEEE International Conference on Big Data (Big Data) 2020-12-10

Distributed Stream Processing systems are becoming an increasingly essential part of Big Data processing platforms as users grow ever more reliant on their ability to provide fast access to new results. As such, making timely decisions based on these results is dependent on a system's ability to tolerate failure. Typically, these systems achieve fault tolerance and the ability to recover automatically from partial failures by implementing checkpoint and rollback recovery. However, owing to the statistical probability of partial failures occurring in these distributed...

10.15439/2022f225 article EN cc-by Annals of Computer Science and Information Systems 2022-09-26

As a canary in a coalmine warns of dwindling breathable air, the honeybee can indicate the health of an ecosystem. Honeybees are the most important pollinators of fruit-bearing flowers, and share similar ecological niches with many other pollinators; therefore, the health of a honeybee colony can reflect the conditions of a whole ecosystem. The health of a colony may be mirrored in social signals that bees exchange during their sophisticated body movements such as the waggle dance. To observe these changes, we developed an automatic system that records and quantifies social signals under normal beekeeping...

10.3389/fnbeh.2021.647224 article EN cc-by Frontiers in Behavioral Neuroscience 2021-04-28

Distributed Stream Processing (DSP) systems enable processing large streams of continuous data to produce results in near real time. They are an essential part of many data-intensive applications and analytics platforms. The rate at which events arrive at DSP systems can vary considerably over time, which may be due to trends, cyclic, and seasonal patterns within the data streams. A priori knowledge of incoming workloads enables proactive approaches to resource management and optimization tasks such as dynamic scaling, live...

10.1109/ic2e52221.2021.00023 article EN 2021-10-01

Distributed Stream Processing (DSP) focuses on the near real-time processing of large streams of unbounded data. To increase processing capacities, DSP systems are able to dynamically scale across a cluster of commodity nodes, ensuring a good Quality of Service despite variable workloads. However, selecting scaleout configurations which maximize resource utilization remains a challenge. This is especially true in environments where workloads change over time and node failures are all but inevitable. Furthermore,...

10.48550/arxiv.2403.02129 preprint EN arXiv (Cornell University) 2024-03-04

Distributed Stream Processing (DSP) systems are capable of processing large streams of unbounded data, offering high throughput and low latencies. To maintain a stable Quality of Service (QoS), these systems require a sufficient allocation of resources. At the same time, over-provisioning can result in wasted energy and high operating costs. Therefore, to maximize resource utilization, autoscaling methods have been proposed that aim to efficiently match the resource allocation with the incoming workload. However, determining when and by how much to scale...

10.48550/arxiv.2403.02093 preprint EN arXiv (Cornell University) 2024-03-04

Distributed Stream Processing (DSP) focuses on the near real-time processing of large streams of unbounded data. To increase processing capacities, DSP systems are able to dynamically scale across a cluster of commodity nodes, ensuring a good Quality of Service despite variable workloads. However, selecting scaleout configurations which maximize resource utilization remains a challenge. This is especially true in environments where workloads change over time and node failures are all but inevitable. Furthermore,...

10.1145/3629526.3645048 article EN cc-by 2024-05-07

Stream processing has become a critical component in the architecture of modern applications. With the exponential growth of data generation from sources such as the Internet of Things, business intelligence, and telecommunications, real-time processing of unbounded data streams has become a necessity. DSP systems provide a solution to this challenge, offering high horizontal scalability, fault-tolerant execution, and the ability to process data streams from multiple sources in a single DSP job. Often enough though, data streams need to be enriched with extra information for correct processing,...

10.48550/arxiv.2307.14287 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Distributed Stream Processing systems have become an essential part of big data processing platforms. They are characterized by the high-throughput processing of near to real-time event streams with the goal of delivering low-latency results and thus enabling time-sensitive decision making. At the same time, results are expected to be consistent even in the presence of partial failures where exactly-once processing guarantees are required for correctness. Stream processing workloads are oftentimes dynamic in nature which makes static configurations highly inefficient as time...

10.48550/arxiv.2206.09679 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Distributed Stream Processing (DSP) systems enable processing large streams of continuous data to produce results in near real time. They are an essential part of many data-intensive applications and analytics platforms. The rate at which events arrive at DSP systems can vary considerably over time, which may be due to trends, cyclic, and seasonal patterns within the data streams. A priori knowledge of incoming workloads enables proactive approaches to resource management and optimization tasks such as dynamic scaling, live...

10.48550/arxiv.2108.04749 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Stream processing has become a critical component in the architecture of modern applications. With the exponential growth of data generation from sources such as the Internet of Things, business intelligence, and telecommunications, real-time processing of unbounded data streams has become a necessity. DSP systems provide a solution to this challenge, offering high horizontal scalability, fault-tolerant execution, and the ability to process data streams from multiple sources in a single DSP job. Often enough though, data streams need to be enriched with extra information for correct processing,...

10.1109/ic2e59103.2023.00030 article EN 2023-09-25

With weather becoming more extreme both in terms of longer dry periods and more severe rain events, municipal water networks are increasingly under pressure. The effects include damages to the pipes, flash floods on the streets and combined sewer overflows. Retrofitting underground infrastructure is very expensive, thus water infrastructure operators are increasingly looking to deploy IoT solutions that promise to alleviate the problems at a fraction of the cost. In this paper, we report on preliminary results from an ongoing joint research project,...

10.48550/arxiv.2012.00400 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Fault tolerance is a property which needs deeper consideration when dealing with streaming jobs requiring high levels of availability and low-latency processing even in case of failures where Quality-of-Service constraints must be adhered to. Typically, systems achieve fault tolerance and the ability to recover automatically from partial failures by implementing Checkpoint and Rollback Recovery. However, this is an expensive operation which impacts negatively on the overall performance of the system and manually optimizing fault tolerance for specific...

10.48550/arxiv.2102.06170 preprint EN cc-by arXiv (Cornell University) 2021-01-01

The emergence of the Internet of Things has seen the introduction of numerous connected devices used for the monitoring and control of even Critical Infrastructures. Distributed stream processing has become key to analyzing data generated by these connected devices and improving our ability to make decisions. However, optimizing these systems towards specific Quality of Service targets is a difficult and time-consuming task, due to the large-scale distributed systems involved, the existence of so many configuration parameters, and the inability to easily determine the impact of tuning...

10.48550/arxiv.2102.06094 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01