Interpretable sparse SIR for functional data

Keywords: functional regression; SIR; sliced inverse regression; lasso; ridge regression; interval selection; variable selection; dimension; classification; models; inputs
Subjects: Statistics Theory (math.ST); Methodology (stat.ME); FOS: Computer and information sciences; FOS: Mathematics; [SDV] Life Sciences [q-bio]; [INFO] Computer Science [cs]; [MATH] Mathematics [math]; [STAT.ME] Statistics [stat]/Methodology [stat.ME]; 519
DOI: 10.1007/s11222-018-9806-6 Publication Date: 2018-03-02T03:16:31Z
ABSTRACT
This work focuses on the issue of variable selection in functional regression. Unlike most work in this framework, our approach does not select isolated points in the definition domain of the predictors, nor does it rely on the expansion of the predictors in a given functional basis. Instead, it selects full intervals made of consecutive points. This feature improves the interpretability of the estimated coefficients and is desirable in the functional framework, in which small shifts are frequent when comparing one predictor (curve) to another. Our method is described in a semiparametric framework based on Sliced Inverse Regression (SIR). SIR is an effective method for dimension reduction of high-dimensional data: it computes a linear projection of the predictors onto a low-dimensional space without loss of regression information. We extend the variable selection approaches developed for multidimensional SIR so that they select intervals rather than separate evaluation points in the definition domain of the functional predictors. Different, equivalent formulations of SIR are combined in a shrinkage approach with a group-LASSO-like penalty. Finally, a fully automated iterative procedure is proposed to find the critical (interpretable) intervals. The approach is shown to be effective on simulated and real data.
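To make the projection step concrete, here is a minimal sketch of plain multivariate SIR in Python with NumPy. It is not the sparse, interval-selecting method of the paper: it has no group-LASSO-like penalty and no interval search. The function name `sir_directions`, the small `ridge` term added to the covariance, the number of slices, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def sir_directions(X, y, n_slices=10, n_directions=1, ridge=1e-6):
    """Plain SIR sketch (hypothetical helper, not the paper's estimator).

    Solves the generalized eigenproblem Gamma a = lambda Sigma a, where
    Sigma is the covariance of X and Gamma is the covariance of the
    slice means of X, with slices built from the ordered values of y.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Center the predictors and estimate their covariance.
    Xc = X - X.mean(axis=0)
    # Assumption: a tiny ridge term keeps Sigma positive definite.
    Sigma = Xc.T @ Xc / n + ridge * np.eye(p)

    # Slice the sample into groups of (nearly) equal size along sorted y.
    slices = np.array_split(np.argsort(y), n_slices)

    # Weighted covariance of the within-slice means of X.
    Gamma = np.zeros((p, p))
    for idx in slices:
        m = Xc[idx].mean(axis=0)
        Gamma += (len(idx) / n) * np.outer(m, m)

    # Reduce to a symmetric eigenproblem with Sigma = L L^T:
    # M = L^{-1} Gamma L^{-T}, then a = L^{-T} v.
    L = np.linalg.cholesky(Sigma)
    A = np.linalg.solve(L, Gamma)
    M = np.linalg.solve(L, A.T)
    M = (M + M.T) / 2.0  # remove round-off asymmetry

    evals, evecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    V = evecs[:, ::-1][:, :n_directions]  # leading eigenvectors
    return np.linalg.solve(L.T, V)


# Simulated example (assumed setting): the response depends on X only
# through one index X @ beta, and beta is nonzero on a single interval
# of consecutive evaluation points.
rng = np.random.default_rng(0)
n, p = 2000, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[5:10] = 1.0
y = np.exp(X @ beta / 2.0) + 0.1 * rng.standard_normal(n)

a_hat = sir_directions(X, y)[:, 0]
cosine = abs(a_hat @ beta) / (np.linalg.norm(a_hat) * np.linalg.norm(beta))
print(f"absolute cosine between estimate and beta: {cosine:.3f}")
```

The estimated direction is identified only up to sign and scale, which is why the comparison uses the absolute cosine. In this sketch the estimate is dense: every evaluation point receives a nonzero coefficient, which is the interpretability problem that interval selection is meant to address.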
REFERENCES (48)
CITATIONS (11)