When is memorization of irrelevant training data necessary for high-accuracy learning?

DOI: 10.1145/3406325.3451131 · Publication Date: 16 June 2021
ABSTRACT
Modern machine learning models are complex and frequently encode surprising amounts of information about individual inputs. In extreme cases, complex models appear to memorize entire input examples, including seemingly irrelevant information (social security numbers from text, for example). In this paper, we aim to understand whether this sort of memorization is necessary for accurate learning. We describe natural prediction problems in which every sufficiently accurate training algorithm must encode, in the prediction model, essentially all the information about a large subset of its training examples. This remains true even when the examples are high-dimensional and have entropy much higher than the sample size, and even when most of that information is ultimately irrelevant to the task at hand. Further, our results do not depend on the training algorithm or the class of models used for learning.
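The central claim can be read information-theoretically. As a minimal sketch (the notation below is illustrative, not quoted from the paper): let S = (X_1, ..., X_n) be the training sample, with each example carrying roughly d bits of entropy, and let M = A(S) be the model produced by a training algorithm A. Memorization is then quantified by the mutual information between the model and the sample, and the result asserts that any sufficiently accurate A must satisfy

\[
  I\bigl(M;\, S\bigr) \;=\; \Omega(n \cdot d),
\]

i.e., the model retains essentially all the bits of a constant fraction of the examples, even though most of those bits are irrelevant to the prediction task.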