On the Stability and Generalization of Learning with Kernel Activation Functions

DOI: 10.48550/arxiv.1903.11990 Publication Date: 2019-01-01
ABSTRACT
In this brief we investigate the generalization properties of a recently proposed class of non-parametric activation functions, the kernel activation functions (KAFs). KAFs introduce additional parameters in the learning process in order to adapt nonlinearities individually on a per-neuron basis, exploiting a cheap kernel expansion of every activation value. While this increase in flexibility has been shown to provide significant improvements in practice, a theoretical proof of its generalization capability has not yet been addressed in the literature. Here, we leverage recent results on the stability of non-convex models trained via stochastic gradient descent (SGD). By indirectly proving two key smoothness properties of the models under consideration, we show that neural networks endowed with KAFs generalize well when trained with SGD for a finite number of steps. Interestingly, our analysis also provides a guideline for selecting one of the hyper-parameters of the model, the bandwidth of the scalar Gaussian kernel. A short experimental evaluation validates the proof.
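To make the per-neuron kernel expansion mentioned in the abstract concrete, the following is a minimal sketch of a KAF layer with a scalar Gaussian kernel: each neuron's activation is a trainable mixture of Gaussian basis functions evaluated on its pre-activation. The class name, the dictionary size, the boundary, and the spacing-based bandwidth rule are illustrative assumptions, not the authors' reference implementation or the bandwidth guideline derived in the paper.

```python
# Minimal sketch of a kernel activation function (KAF) with a scalar Gaussian kernel.
# Hyper-parameter names and the bandwidth rule of thumb are assumptions for illustration.
import torch
import torch.nn as nn

class KAF(nn.Module):
    def __init__(self, num_neurons, dict_size=20, boundary=3.0):
        super().__init__()
        # Fixed dictionary: dict_size points uniformly spaced in [-boundary, boundary].
        d = torch.linspace(-boundary, boundary, dict_size)
        self.register_buffer("dictionary", d.view(1, 1, -1))
        # Bandwidth set from the dictionary spacing (an assumption here; the paper
        # derives a guideline for choosing this hyper-parameter).
        delta = (d[1] - d[0]).item()
        self.gamma = 1.0 / (2.0 * delta ** 2)
        # Trainable mixing coefficients, one set per neuron (adapted during SGD).
        self.alpha = nn.Parameter(0.3 * torch.randn(1, num_neurons, dict_size))

    def forward(self, s):
        # s: (batch, num_neurons). Expand each pre-activation against the dictionary
        # with the Gaussian kernel, then mix with the per-neuron coefficients.
        K = torch.exp(-self.gamma * (s.unsqueeze(-1) - self.dictionary) ** 2)
        return (K * self.alpha).sum(dim=-1)

# Usage: drop the KAF in place of a fixed nonlinearity inside a network trained with SGD.
net = nn.Sequential(nn.Linear(10, 64), KAF(64), nn.Linear(64, 1))
y = net(torch.randn(32, 10))
print(y.shape)  # torch.Size([32, 1])
```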