P2P-MapReduce: Parallel data processing in dynamic Cloud environments

Implementation
DOI: 10.1016/j.jcss.2011.12.021 Publication Date: 2011-12-31T02:46:23Z
ABSTRACT
AbstractMapReduce is a programming model for parallel data processing widely used in Cloud computing environments. Current MapReduce implementations are based on centralized master-slave architectures that do not cope well with dynamic Cloud infrastructures, like a Cloud of clouds, in which nodes may join and leave the network at high rates. We have designed an adaptive MapReduce framework, called P2P-MapReduce, which exploits a peer-to-peer model to manage node churn, master failures, and job recovery in a decentralized but effective way, so as to provide a more reliable MapReduce middleware that can be effectively exploited in dynamic Cloud infrastructures. This paper describes the P2P-MapReduce system providing a detailed description of its basic mechanisms, a prototype implementation, and an extensive performance evaluation in different network scenarios. The performance results confirm the good fault tolerance level provided by the P2P-MapReduce framework compared to a centralized implementation of MapReduce, as well as its limited impact in terms of network overhead.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (22)
CITATIONS (61)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....