Are All Unseen Data Out-of-Distribution?

DOI: 10.48550/arxiv.2312.16243 Publication Date: 2023-01-01
ABSTRACT
While deep neural networks can achieve good performance on in-distribution samples, their generalization ability significantly degrades under unknown test shifts. We study the out-of-distribution (OOD) generalization capability of models by exploring the relationship between generalization error and training set size. Previous empirical evidence suggests that error falls off as a power of training set size, and that lower errors indicate better model generalization. However, our observations show this does not hold for OOD samples. Counterintuitively, increasing the training data does not always lead to a decrease in test error. We formally investigate this non-decreasing phenomenon in a linear setting and verify it empirically across varying visual benchmarks. Motivated by these results, we redefine OOD data as data located outside the convex hull of the training mixture and prove a new generalization bound under this definition. Together, our observations highlight that the effectiveness of well-trained models can be guaranteed only for data within the convex hull of the training mixture; for data beyond this coverage, effectiveness may be unassured. To improve generalization without knowledge of the target environments, we demonstrate multiple strategies, including data augmentation and pre-training. We also employ a novel data selection algorithm that outperforms baselines.
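
To make the power-law claim concrete, here is a minimal sketch (not from the paper; the training sizes and error values are synthetic placeholders) of fitting err(n) ~ a * n^(-b) to test error measured at several training set sizes n, via linear regression in log-log space:

```python
# Minimal sketch (not from the paper): fitting the power-law relationship
# err(n) ~ a * n**(-b) between test error and training set size n.
# The error values below are synthetic placeholders, not real measurements.
import numpy as np

train_sizes = np.array([1_000, 2_000, 4_000, 8_000, 16_000])
test_errors = np.array([0.30, 0.22, 0.16, 0.12, 0.09])  # hypothetical

# log(err) = log(a) - b * log(n): a straight line in log-log space.
slope, intercept = np.polyfit(np.log(train_sizes), np.log(test_errors), 1)
a, b = np.exp(intercept), -slope
print(f"err(n) ~= {a:.3f} * n^(-{b:.3f})")
```

Under the in-distribution assumption, b > 0 and error keeps shrinking with more data; the paper's point is that for OOD test data this fitted curve can flatten or even rise.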
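The convex-hull redefinition of OOD can be operationalized as a feasibility test. The sketch below is illustrative, not the paper's algorithm: it assumes each training domain is summarized by a representative vector mu_i (e.g. a mean feature embedding; the name `in_convex_hull` and the toy vectors are hypothetical) and asks whether a query point x is a convex combination of the mu_i by solving a linear program:

```python
# Illustrative sketch, not the paper's algorithm: each training domain is
# summarized by a representative vector mu_i (an assumption, e.g. a mean
# feature embedding), and a query x counts as in-distribution iff
#   x = sum_i w_i * mu_i  with  w_i >= 0  and  sum_i w_i = 1
# has a solution, i.e. x lies in the convex hull of the mu_i.
import numpy as np
from scipy.optimize import linprog

def in_convex_hull(x, mus):
    """mus: (k, d) array of domain representatives; x: (d,) query point."""
    k = mus.shape[0]
    # Equality constraints: mus.T @ w = x and sum(w) = 1; bounds give w >= 0.
    A_eq = np.vstack([mus.T, np.ones((1, k))])
    b_eq = np.concatenate([x, [1.0]])
    res = linprog(c=np.zeros(k), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * k, method="highs")
    return res.status == 0  # feasible => inside (or on) the hull

mus = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # three toy domains
print(in_convex_hull(np.array([0.3, 0.3]), mus))  # True: inside the hull
print(in_convex_hull(np.array([1.0, 1.0]), mus))  # False: OOD by this test
```

Real features are high-dimensional and noisy, so a practical variant would test approximate membership (e.g. minimize the residual of x against the hull) rather than exact feasibility.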