On the Provable Generalization of Recurrent Neural Networks
CONCEPTS: Initialization, Sequence (biology)
DOI:
10.48550/arxiv.2109.14142
Publication Date:
2021-01-01
AUTHORS (4)
ABSTRACT
Recurrent Neural Network (RNN) is a fundamental structure in deep learning. Recently, some works have studied the training process of over-parameterized neural networks and shown that such networks can learn functions from notable concept classes with a provable generalization error bound. In this paper, we analyze the training and generalization of RNNs with random initialization and provide the following improvements over recent works: 1) For an RNN with input sequence $x=(X_1,X_2,...,X_L)$, previous works study functions that are summations of $f(\beta^T_l X_l)$ and require normalized conditions $||X_l||\leq\epsilon$ with a very small $\epsilon$ depending on the complexity of $f$. Using a detailed analysis of the neural tangent kernel matrix, we prove a generalization error bound for learning such functions without the normalized conditions, with the numbers of iterations and samples scaling almost-polynomially in the input length $L$. 2) Moreover, we prove a novel result for learning $N$-variable functions of the form $f(\beta^T[X_{l_1},...,X_{l_N}])$, which do not belong to the "additive" concept class, i.e., summations of functions $f(X_l)$. We show that when either $N$ or $l_0=\max(l_1,...,l_N)-\min(l_1,...,l_N)$ is small, $f(\beta^T[X_{l_1},...,X_{l_N}])$ is learnable with the numbers of iterations and samples scaling almost-polynomially in the input length $L$.
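Below is a minimal sketch, not taken from the paper, illustrating the two target concept classes named in the abstract: the "additive" class (a summation over positions of $f(\beta_l^T X_l)$) and the $N$-variable class $f(\beta^T[X_{l_1},...,X_{l_N}])$. The choice of link function $f$, the weight vectors, and the toy dimensions are assumptions for illustration only, not the authors' setup.

```python
# Illustrative sketch of the concept classes studied in the abstract.
# f, the weights beta_l / beta, and the dimensions are hypothetical choices.
import numpy as np

rng = np.random.default_rng(0)

L, d = 8, 5                      # sequence length and token dimension (assumed)
X = rng.normal(size=(L, d))      # input sequence x = (X_1, ..., X_L)

def f(z):
    """A smooth 1-D link function; any sufficiently smooth f fits the framework."""
    return np.tanh(z)

# 1) "Additive" concept class: sum over positions l of f(beta_l^T X_l).
betas = rng.normal(size=(L, d))
additive_target = sum(f(betas[l] @ X[l]) for l in range(L))

# 2) N-variable concept class: f(beta^T [X_{l_1}, ..., X_{l_N}]),
#    a single f applied to the concatenation of N selected tokens.
l_idx = [2, 3, 5]                            # l_1 < ... < l_N (assumed positions)
l_0 = max(l_idx) - min(l_idx)                # the span quantity from the abstract
beta = rng.normal(size=(len(l_idx) * d,))
concat = np.concatenate([X[l] for l in l_idx])
n_variable_target = f(beta @ concat)

print(additive_target, n_variable_target, l_0)
```

The abstract's second result states that such $N$-variable targets are learnable by an over-parameterized RNN when either $N$ or the span $l_0$ is small; the sketch only shows how the targets themselves are formed from the input sequence.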