
P2P-MapReduce: Parallel data processing in dynamic Cloud environments

Implementation

DOI: 10.1016/j.jcss.2011.12.021

Publication Date: 2011-12-31T02:46:23Z


AUTHORS (3)

Fabrizio Marozzo

Domenico Talia

Paolo Trunfio

ABSTRACT

MapReduce is a programming model for parallel data processing widely used in Cloud computing environments. Current MapReduce implementations are based on centralized master-slave architectures that do not cope well with dynamic Cloud infrastructures, such as a Cloud of clouds, in which nodes may join and leave the network at high rates. We have designed an adaptive MapReduce framework, called P2P-MapReduce, which exploits a peer-to-peer model to manage node churn, master failures, and job recovery in a decentralized but effective way, so as to provide a more reliable MapReduce middleware that can be effectively exploited in dynamic Cloud infrastructures. This paper describes the P2P-MapReduce system, providing a detailed description of its basic mechanisms, a prototype implementation, and an extensive performance evaluation in different network scenarios. The performance results confirm the good fault tolerance level provided by the P2P-MapReduce framework compared to a centralized implementation of MapReduce, as well as its limited impact in terms of network overhead.
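For readers unfamiliar with the programming model named in the abstract, the following is a minimal single-process sketch of the generic map, shuffle, and reduce phases, using word counting as the job. It is not the P2P-MapReduce implementation and does not model masters, slaves, node churn, or job recovery; all function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`, and the word-count helpers) and the sample input are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the user-supplied mapper to each record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group intermediate values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Apply the user-supplied reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in groups.items()}


def run_job(records, mapper, reducer):
    """Run the three phases sequentially in one process (illustrative only)."""
    return reduce_phase(shuffle(map_phase(records, mapper)), reducer)


# Hypothetical word-count job: the mapper emits (word, 1), the reducer sums.
def word_count_mapper(line):
    for word in line.lower().split():
        yield word, 1


def word_count_reducer(key, values):
    return sum(values)


# Sample input invented for this sketch.
counts = run_job(
    ["cloud of clouds", "cloud nodes join and leave"],
    word_count_mapper,
    word_count_reducer,
)
print(counts)
```

In a real deployment the map and reduce calls run on different nodes and the framework handles scheduling and failures; that coordination layer is what the paper redesigns with a peer-to-peer model, and it is deliberately left out of this sketch.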


REFERENCES (22)

CITATIONS (61)


