An invariant framework for conducting reproducible computational science
0103 physical sciences
01 natural sciences
DOI:
10.1016/j.jocs.2015.04.012
Publication Date:
2015-04-18T14:31:55Z
AUTHORS (6)
ABSTRACT
Abstract Computational reproducibility depends on the ability to not only isolate necessary and sufficient computational artifacts but also to preserve those artifacts for later re-execution. Both isolation and preservation present challenges in large part due to the complexity of existing software and systems as well as the implicit dependencies, resource distribution, and shifting compatibility of systems that result over time—all of which conspire to break the reproducibility of an application. Sandboxing is a technique that has been used extensively in OS environments in order to isolate computational artifacts. Several tools were proposed recently that employ sandboxing as a mechanism to ensure reproducibility. However, none of these tools preserve the sandboxed application for re-distribution to a larger scientific community aspects that are equally crucial for ensuring reproducibility as sandboxing itself. In this paper, we describe a framework of combined sandboxing and preservation, which is not only efficient and invariant, but also practical for large-scale reproducibility. We present case studies of complex high-energy physics applications and show how the framework can be useful for sandboxing, preserving, and distributing applications. We report on the completeness, performance, and efficiency of the framework, and suggest possible standardization approaches.
SUPPLEMENTAL MATERIAL
Coming soon ....
REFERENCES (22)
CITATIONS (20)
EXTERNAL LINKS
PlumX Metrics
RECOMMENDATIONS
FAIR ASSESSMENT
Coming soon ....
JUPYTER LAB
Coming soon ....