- Cloud Computing and Resource Management
- IoT and Edge/Fog Computing
- Software System Performance and Reliability
- Distributed Systems and Fault Tolerance
- Advanced Data Storage Technologies
- Green IT and Sustainability
- Cloud Data Security Solutions
- Distributed and Parallel Computing Systems
- Advanced Queuing Theory Analysis
- Software-Defined Networks and 5G
- Mobile Health and mHealth Applications
- Process Optimization and Integration
- Smart Grid Security and Resilience
- Blockchain Technology Applications and Security
Universidade Federal de Pernambuco
2017-2022
Guaranteeing high levels of availability is a huge challenge for cloud providers. The authors examine the causes of failures and recommend ways to prevent them or to minimize their effects when they occur.
Cloud computing has gained popularity in recent years due to its pay-as-you-go business model, high availability of services, and scalability. Service unavailability does not affect just user experience but also translates into direct costs for cloud provider companies. Part of this comes from SLA breaches, since interruption times greater than those signed in the contract generate financial penalties. Thus, providers have tried to identify failure points and estimate the availability of their services. This paper proposes models to assess...
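The link between availability and SLA penalties described above can be sketched numerically. This is a minimal illustration, not the paper's model; the penalty rate and contracted downtime limit are hypothetical values.

```python
# Illustrative sketch: convert steady-state availability into expected
# yearly downtime and an SLA penalty. All numbers are hypothetical,
# not taken from the paper.

HOURS_PER_YEAR = 365 * 24  # 8760

def downtime_hours(availability: float) -> float:
    """Expected downtime per year for a given steady-state availability."""
    return (1.0 - availability) * HOURS_PER_YEAR

def sla_penalty(availability: float, sla_hours: float, penalty_per_hour: float) -> float:
    """Penalty owed when yearly downtime exceeds the contracted limit."""
    excess = max(0.0, downtime_hours(availability) - sla_hours)
    return excess * penalty_per_hour

# A "three nines" service allows roughly 8.76 hours of downtime per year.
print(round(downtime_hours(0.999), 2))                                   # 8.76
print(round(sla_penalty(0.999, sla_hours=4.0, penalty_per_hour=100.0), 2))  # 476.0
```

Providers can run such a calculation per service tier to see how each additional "nine" of availability shrinks the expected penalty.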
E-health systems can be used to monitor people in real time, offering a range of multimedia-based health services while reducing costs, since they are composed of inexpensive devices. However, any downtime, mainly in critical cases, may result in patient problems and, in the worst case, loss of life. In this paper, we use an interdisciplinary approach combining stochastic models with optimisation algorithms to analyse how failures impact e-health monitoring system availability. We propose surrogate models to estimate availability...
A data center is divided into three basic subsystems: information technology (IT), power, and cooling. Cooling plays an important role in availability, since a failure in this subsystem may cause interruption of services. Generally, redundant cooling is implemented by replacing a failed component with a standby one. However, it can also be achieved by rotating computer room air conditioners (CRACs). This paper proposes scalable models that represent this behavior and evaluate the impact of failures on availability...
The cloud data center is a complex system composed of power, cooling, and IT subsystems. The power subsystem is crucial to feed the equipment, and power disruptions may result in service unavailability. This paper analyzes the impact of power failures on services for different architecture configurations based on the TIA-942 standard: non-redundant, redundant, concurrently maintainable, and fault tolerant. We model both subsystems, power and IT, through Stochastic Petri Nets (SPNs). The availability results show that the fault-tolerant...
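When subsystems such as power, cooling, and IT must all be up for the service to be up, their steady-state availabilities combine as a series system (a product). The sketch below uses hypothetical MTTF/MTTR values, not figures from the paper.

```python
# Sketch: steady-state availability of subsystems in series.
# A service that needs power AND cooling AND IT simultaneously has
# A_total = A_power * A_cooling * A_IT. MTTF/MTTR values are hypothetical.

def availability(mttf_h: float, mttr_h: float) -> float:
    """Steady-state availability from mean time to failure and to repair."""
    return mttf_h / (mttf_h + mttr_h)

subsystems = {
    "power":   (50_000.0, 8.0),   # (MTTF hours, MTTR hours)
    "cooling": (30_000.0, 12.0),
    "it":      (20_000.0, 2.0),
}

a_total = 1.0
for name, (mttf, mttr) in subsystems.items():
    a_total *= availability(mttf, mttr)

print(round(a_total, 6))  # overall availability of the series system
```

The product form makes it clear why redundancy matters: the weakest subsystem dominates, since any single complement (1 - A) adds almost directly to total downtime.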
Large data centers are complex systems that depend on several generations of hardware and software components, ranging from legacy mainframes and rack-based appliances to modular blade servers and modern rack scale design solutions. To cope with this heterogeneity, the data center manager must coordinate a multitude of tools, protocols, and standards. Currently, managers, standardization bodies, and hardware/software manufacturers are joining efforts to develop and promote Redfish as the main management standard for data centers, even...
The emergence of new computing paradigms such as fog and edge computing provides the Internet of Things with the needed connectivity and high availability. In the context of e-health systems, wearable sensors are being used to continuously collect information about our health and forward it for processing by the Internet of Medical Things (IoMT). E-health systems designed to assist subjects in real time, providing them with a range of multimedia-based health services and personalised treatment, promise to reduce the economic burden on health systems. Nonetheless, any...
Traditional data center infrastructure suffers from a lack of standard and ubiquitous management solutions. Despite the advances achieved, the interoperability of existing tools is sometimes hardware dependent. Vendors are already actively participating in the specification and design of new software interfaces within different forums. Nevertheless, the complexity and variety of components, which include servers, cooling, networking, and power hardware, coupled with the introduction of the software-defined paradigm, have led to the parallel development...
Next-generation cloud data centers are based on software-defined data center infrastructures that promote flexibility, automation, optimization, and scalability. The Redfish standard and the Intel Rack Scale Design technology enable the infrastructure to disaggregate bare-metal compute, storage, and networking resources into virtual pools in order to dynamically compose performance-optimized virtual PODs (vPODs) tailored to workload-specific demands. This article proposes four chassis design configurations based on Distributed...
Many enterprises rely on cloud infrastructure to host their critical applications (such as trading, banking transactions, airline reservation systems, and credit card authorization). The unavailability of these applications may lead to severe consequences that go beyond financial losses, reaching the provider's reputation as well. However, maintaining high availability in a data center is a difficult task due to its complexity. The power subsystem is crucial for the entire operation because it supplies all other subsystems, including...
Making data centers highly available remains a challenge that must be considered from the design phase. The problem is selecting the right strategies and components to achieve this goal given a limited investment. Furthermore, data center designers currently lack reliable specialized tools to accomplish this task. In this paper, we disclose a formal method that chooses components to optimize the availability of a data center while considering the budget as a constraint. For that, we make use of stochastic models to represent cloud infrastructure based on...
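The budget-constrained selection problem described above can be illustrated with a toy exhaustive search: pick one design option per subsystem so that the product of availabilities is maximized without exceeding the budget. The options, availabilities, and costs below are hypothetical, and the paper's actual method is a formal stochastic-model-based approach, not this brute force.

```python
# Toy sketch of availability optimization under a budget constraint:
# enumerate one option per subsystem, keep the feasible combination
# with the highest series availability. All figures are hypothetical.
from itertools import product

options = {
    "power":   [("non-redundant", 0.9990, 10), ("redundant", 0.9999, 25)],
    "cooling": [("single CRAC", 0.9980, 5), ("CRAC rotation", 0.9995, 12)],
}

def best_configuration(budget: int):
    """Return (chosen option names, availability, cost) within budget."""
    best = None
    for combo in product(*options.values()):
        cost = sum(c for _, _, c in combo)
        if cost > budget:
            continue  # infeasible combination
        avail = 1.0
        for _, a, _ in combo:
            avail *= a  # subsystems in series
        if best is None or avail > best[1]:
            best = (tuple(name for name, _, _ in combo), avail, cost)
    return best

names, avail, cost = best_configuration(budget=35)
print(names, round(avail, 6), cost)
```

Real design spaces are far too large for enumeration, which is why the paper resorts to formal optimization over stochastic models, but the objective and constraint have the same shape.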
Emergency call services are expected to be highly available in order to minimize the loss of urgent calls and, as a consequence, of life due to the lack of a timely medical response. This service availability depends heavily on the cloud data center on which it is hosted. However, availability information alone cannot provide sufficient understanding of how failures impact the service and users' perception. In this paper, we evaluate an emergency call system, considering service-level metrics such as the number of calls affected per failure and the time it takes...
To assess the availability of different data center configurations, understand the main root causes of failures, and represent low-level details such as each subsystem's behavior and their interconnections, we have proposed, in previous works, a set of stochastic models for data center architectures (considering three subsystems: power, cooling, and IT) based on the TIA-942 standard. In this paper, we propose Data Center Availability (DCAV), a web-based software system that allows operators to evaluate their infrastructure through...
Cooling plays a very important role in data centre availability by mitigating the overheating of Information Technology (IT) equipment. While many existing works have evaluated the performance of cooling sub-systems in data centres, only a few studies have considered the relationship between the cooling and IT sub-systems. This work provides efficient models (using Stochastic Petri Nets (SPNs)) to represent the cooling sub-system and analyse the impact of its failures in terms of service downtime and financial cost. We provide a model, diminishing state...
Users pay for running their applications on cloud infrastructure, and in return they expect high availability and minimal data loss in case of failure. From a provider's perspective, any hardware or software failure must be detected and recovered as quickly as possible to maintain users' trust and avoid financial losses. From the user's perspective, failures should be transparent and should not impact application performance. In order to recover a failed application, providers perform checkpoints that periodically save data, which can then be restored following...
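Checkpoint frequency involves the trade-off this abstract alludes to: checkpointing too often wastes runtime, too rarely loses more work per failure. A classical rule of thumb is Young's first-order approximation, sketched below; the paper's own checkpoint policy may differ, and the cost and MTBF figures are hypothetical.

```python
# Sketch of Young's approximation for the checkpoint interval:
# interval ~ sqrt(2 * C * MTBF), where C is the time to take one
# checkpoint and MTBF is the mean time between failures.
# Numbers below are hypothetical, not from the paper.
import math

def optimal_checkpoint_interval(checkpoint_cost_s: float, mtbf_s: float) -> float:
    """First-order optimal time between checkpoints, in seconds."""
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

# Example: a 30 s checkpoint cost and a 24 h MTBF.
interval = optimal_checkpoint_interval(30.0, 24 * 3600.0)
print(round(interval / 60.0, 1))  # interval expressed in minutes
```

The square-root dependence means that halving the checkpoint cost (e.g., via incremental checkpoints) lets the provider checkpoint only about 1.4 times more often, not twice as often.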