Adaptive-saturated RNN: Remember more with less instability
DOI: 10.48550/arxiv.2304.11790
Publication Date: 2023-01-01
AUTHORS (3)
ABSTRACT
Orthogonal parameterization is a compelling solution to the vanishing gradient problem (VGP) in recurrent neural networks (RNNs). With orthogonal parameters and non-saturated activation functions, gradients in such models are constrained to unit norms. On the other hand, although traditional vanilla RNNs are seen to have higher memory capacity, they suffer from the VGP and perform badly in many applications. This work proposes the Adaptive-Saturated RNN (asRNN), a variant that dynamically adjusts its saturation level between the two mentioned approaches. Consequently, asRNN enjoys both the capacity of vanilla RNNs and the training stability of orthogonal RNNs. Our experiments show encouraging results of asRNN on challenging sequence learning benchmarks compared to several strong competitors. The research code is accessible at https://github.com/ndminhkhoi46/asRNN/.
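The adaptive-saturation idea can be illustrated with a minimal, hypothetical recurrent cell. The sketch below is not the authors' parameterization (the linked repository holds the actual implementation); it only assumes a trainable per-unit gain that rescales the pre-activation inside tanh and compensates outside, so a small gain keeps the cell near the non-saturated, orthogonal-RNN-like regime while a large gain recovers vanilla-RNN saturation. The names AdaptiveSaturatedCell and log_a are illustrative only.

import torch
import torch.nn as nn

class AdaptiveSaturatedCell(nn.Module):
    """Illustrative RNN cell with a learnable saturation level.

    Sketch only, not the asRNN formulation:
        h_t = tanh(a * (W h_{t-1} + V x_t + b)) / a
    Small a keeps the pre-activation in tanh's near-linear region;
    large a yields a bounded, saturating vanilla-RNN-style update.
    """

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.inp = nn.Linear(input_size, hidden_size, bias=True)
        self.rec = nn.Linear(hidden_size, hidden_size, bias=False)
        # One saturation gain per hidden unit, initialized at 1 (plain tanh).
        self.log_a = nn.Parameter(torch.zeros(hidden_size))

    def forward(self, x_t: torch.Tensor, h_prev: torch.Tensor) -> torch.Tensor:
        a = torch.exp(self.log_a)          # keep the gain positive
        pre = self.inp(x_t) + self.rec(h_prev)
        return torch.tanh(a * pre) / a     # adaptive saturation

if __name__ == "__main__":
    cell = AdaptiveSaturatedCell(input_size=8, hidden_size=16)
    h = torch.zeros(1, 16)
    for _ in range(5):
        h = cell(torch.randn(1, 8), h)
    print(h.shape)  # torch.Size([1, 16])

Because the gain is learned per unit, each hidden dimension can settle on its own trade-off between memory capacity and gradient stability, which is the behavior the abstract attributes to asRNN.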