Restoring the Executability of Jupyter Notebooks by Automatic Upgrade of Deprecated APIs

Application programming interface; Executable; Upgrade; Code (set theory)
DOI: 10.1109/ase51524.2021.9678889 Publication Date: 2022-01-20T20:33:49Z
ABSTRACT
Data scientists typically practice exploratory programming using computational notebooks, to comprehend new data and extract insights. To do this they iteratively refine their code, actively trying to re-use and re-purpose solutions created by other data scientists, in real time. However, recent studies have shown that a vast majority of publicly available notebooks cannot be executed out of the box. One of the prominent reasons is the deprecation of data science APIs used in such notebooks, due to the rapid evolution of data science libraries. In this work we propose RELANCER, an automatic technique that restores the executability of broken Jupyter Notebooks, in near real time, by upgrading deprecated APIs. RELANCER employs an iterative runtime-error-driven approach to identify and fix one API issue at a time. This is supported by a machine-learned model which uses the runtime error message to predict the kind of API repair needed: an update to the API or package name, a parameter, or a parameter value. Then RELANCER creates a search space of candidate repairs by combining knowledge from API migration examples on GitHub as well as the API documentation, and employs a second machine-learned model to rank this space of API mappings. An evaluation of RELANCER on a curated dataset of 255 un-executable Jupyter Notebooks from Kaggle shows that RELANCER can successfully restore the executability of 56% of the subjects, while baselines relying on just GitHub examples and on just API documentation can only fix 38% and 36% of the subjects respectively. Further, pursuant to its real-time use case, RELANCER can restore execution to 49% of the subjects within a 5 minute time limit, while a baseline lacking its machine learning models can only fix 24%.
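The iterative, runtime-error-driven loop described in the abstract can be illustrated with a minimal Python sketch. This is not RELANCER's implementation. It rests on several assumptions: `toylib` is a made-up stand-in for an evolving data science library, `predict_repair_kind` is a hand-written rule that takes the place of the machine-learned error classifier, and `CANDIDATES` is a hypothetical table standing in for mappings mined from GitHub and API documentation. The sketch also takes the first matching candidate rather than ranking candidates with a second model.

```python
from types import SimpleNamespace


def _load_table(rows):
    return list(rows)


def _scale(values, factor=1):
    return [v * factor for v in values]


# Toy "library" at its current version (hypothetical): read_table was renamed
# to load_table, and the keyword mult= of scale() was renamed to factor=.
toylib = SimpleNamespace(load_table=_load_table, scale=_scale)

# Hypothetical candidate repairs, keyed by repair kind (old text, new text).
CANDIDATES = {
    "name": [("read_table", "load_table")],
    "parameter": [("mult=", "factor=")],
}


def predict_repair_kind(error):
    """Rule-based stand-in for a classifier over runtime error messages."""
    if isinstance(error, (AttributeError, NameError, ImportError)):
        return "name"
    if isinstance(error, TypeError) and "unexpected keyword" in str(error):
        return "parameter"
    return None


def run(source):
    """Execute source; return (namespace, None) or (None, exception)."""
    env = {"toylib": toylib}
    try:
        exec(source, env)
        return env, None
    except Exception as exc:
        return None, exc


def repair(source, max_steps=5):
    """Fix one API issue per iteration until the code runs or no repair fits."""
    for _ in range(max_steps):
        env, error = run(source)
        if error is None:
            return source, env
        kind = predict_repair_kind(error)
        for old, new in CANDIDATES.get(kind, []):
            if old in source:
                source = source.replace(old, new)  # apply first matching candidate
                break
        else:
            raise error  # no applicable candidate for this error
    raise RuntimeError("step budget exhausted")


BROKEN = (
    "data = toylib.read_table([1, 2, 3])\n"
    "result = toylib.scale(data, mult=3)\n"
)
fixed, env = repair(BROKEN)
print(fixed)
print(env["result"])
```

In this toy run the first execution fails on the renamed function, the second on the renamed keyword, and the third succeeds, which mirrors the one-issue-at-a-time strategy. A fuller version would validate each candidate by re-execution and order candidates by a learned ranking rather than by table position.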
REFERENCES (80)
CITATIONS (7)