Probing out-of-distribution generalization in machine learning for materials
Condensed Matter - Materials Science
TA401-492
Materials Science (cond-mat.mtrl-sci)
FOS: Physical sciences
Materials of engineering and construction. Mechanics of materials
DOI:
10.48550/arxiv.2406.06489
Publication Date:
2024-06-10
AUTHORS (8)
ABSTRACT
Scientific machine learning (ML) endeavors to develop generalizable models with broad applicability. However, the assessment of generalizability is often based on heuristics. Here, we demonstrate in the materials science setting that heuristics-based evaluations lead to substantially biased conclusions about ML generalizability and the benefits of neural scaling. We evaluate generalization performance in over 700 out-of-distribution tasks that feature new chemistry or structural symmetry not present in the training data. Surprisingly, good performance is found in most tasks and across various ML models, including simple boosted trees. Analysis of the materials representation space reveals that most tasks contain test data that lie in regions well covered by training data, while poorly-performing tasks contain mainly test data outside the training domain. For the latter case, increasing training set size or training time has marginal or even adverse effects on the generalization performance, contrary to what the neural scaling paradigm assumes. Our findings show that most heuristically-defined out-of-distribution tests are not genuinely difficult and evaluate only the ability to interpolate. Evaluating on such tasks rather than the truly challenging ones can lead to an overestimation of generalizability and the benefits of scaling.
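To illustrate what a heuristically-defined out-of-distribution task of the kind described in the abstract might look like, here is a minimal sketch of a leave-one-element-out split. It is not the authors' code: this record does not give the datasets, features, or grouping criteria, so the function names (elements_in, leave_one_element_out) and the toy formulas are hypothetical. The sketch only shows that the training set excludes every material containing the held-out element.

```python
import re


def elements_in(formula):
    """Return the set of element symbols in a simple formula such as 'Fe2O3'.

    Assumption: formulas use standard capitalisation and no parentheses.
    """
    return set(re.findall(r"[A-Z][a-z]?", formula))


def leave_one_element_out(formulas, held_out):
    """Split indices so that no training formula contains `held_out`."""
    train, test = [], []
    for i, formula in enumerate(formulas):
        if held_out in elements_in(formula):
            test.append(i)   # unseen chemistry goes to the test set
        else:
            train.append(i)  # everything else is available for training
    return train, test


# Toy data (hypothetical), holding out every iron-containing material.
formulas = ["Fe2O3", "NaCl", "LiFePO4", "SiO2", "MgO", "KCl"]
train_idx, test_idx = leave_one_element_out(formulas, "Fe")
print("train:", [formulas[i] for i in train_idx])
print("test: ", [formulas[i] for i in test_idx])
```

A split like this guarantees the held-out element is absent from training, but, as the abstract argues, it does not guarantee that the test materials fall outside the region of representation space covered by the training data.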