Operator-induced structural variable selection for identifying materials genes
DOI: 10.48550/arxiv.2110.10195
Publication Date: 2021-01-01
AUTHORS (3)
ABSTRACT
In the emerging field of materials informatics, a fundamental task is to identify physicochemically meaningful descriptors, or materials genes, which are engineered from primary features and a set of elementary algebraic operators through compositions. Standard practice directly analyzes the high-dimensional candidate predictor space in a linear model; statistical analyses are then substantially hampered by the daunting challenge posed by the astronomically large number of correlated predictors with limited sample size. We formulate this problem as variable selection with operator-induced structure (OIS) and propose a new method to achieve unconventional dimension reduction by utilizing the geometry embedded in OIS. Although the model remains linear, we iterate nonparametric variable selection for effective dimension reduction. This enables variable selection based on ab initio primary features, leading to a method that is orders of magnitude faster than existing methods, with improved accuracy. To select the nonparametric module, we discuss a desired performance criterion that is uniquely induced by variable selection with OIS; in particular, we propose to employ a Bayesian Additive Regression Trees (BART)-based variable selection method. Numerical studies show the superiority of the proposed method, which continues to exhibit robust performance when the input dimension is out of reach of existing methods. Our analysis of single-atom catalysis identifies physical descriptors that explain the binding energy of metal-support pairs with high explanatory power, leading to interpretable insights to guide the prevention of a notorious problem called sintering and to aid catalysis design.
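The two ideas in the abstract, building candidate descriptors by composing operators over primary features and interleaving that expansion with variable selection, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the operator sets `UNARY` and `BINARY`, the helpers `expand`, `screen`, and `select_then_expand`, and the synthetic data. In particular, `screen` ranks columns by absolute Spearman correlation as a simple stand-in for a nonparametric selector; it is not the BART-based method the abstract describes.

```python
import itertools
import numpy as np

# Hypothetical operator sets; the abstract does not list the operators used.
UNARY = {"sq": np.square, "exp": np.exp}
BINARY = {"+": np.add, "*": np.multiply}

def expand(features):
    """Apply one layer of operators to a {name: column} dict."""
    out = dict(features)
    for name, col in features.items():
        for op, fn in UNARY.items():
            out[f"{op}({name})"] = fn(col)
    for (a, xa), (b, xb) in itertools.combinations(features.items(), 2):
        for op, fn in BINARY.items():
            out[f"({a}{op}{b})"] = fn(xa, xb)
    return out

def _ranks(v):
    """Ranks 0..n-1 of the entries of v (ties broken by position)."""
    return np.argsort(np.argsort(v)).astype(float)

def screen(features, y, keep):
    """Keep the `keep` columns with the largest |Spearman correlation| with y.

    Stand-in for a nonparametric selector; NOT the BART-based method.
    """
    ry = _ranks(y)
    score = {n: abs(np.corrcoef(_ranks(c), ry)[0, 1])
             for n, c in features.items()}
    best = sorted(score, key=score.get, reverse=True)[:keep]
    return {n: features[n] for n in best}

def select_then_expand(X, y, layers=2, keep=3):
    """Alternate screening and operator expansion, starting from primary features."""
    feats = {f"x{j}": X[:, j] for j in range(X.shape[1])}
    for _ in range(layers):
        feats = expand(screen(feats, y, keep))
    return feats

# Synthetic data (an assumption): the response depends on x0 and x1 only.
rng = np.random.default_rng(0)
X = rng.uniform(0.1, 1.0, size=(40, 6))
y = np.exp(X[:, 0]) * X[:, 1] + 0.01 * rng.normal(size=40)

one_layer = expand({f"x{j}": X[:, j] for j in range(3)})
candidates = select_then_expand(X, y)
print(len(one_layer), len(candidates))
```

Without screening, each layer grows the candidate set roughly quadratically in the number of columns it starts from, so a few compositions over even a modest set of primary features quickly produce a very large, highly correlated design matrix. Screening before each expansion keeps every layer small. In a full pipeline, the surviving candidates would still be passed to a final linear model fit, which this sketch omits.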