Sparse Attentive Backtracking: Temporal Credit Assignment Through Reminding
DOI:
10.48550/arXiv.1809.03702
Publication Date:
2018-01-01
AUTHORS (7)
ABSTRACT
Learning long-term dependencies in extended temporal sequences requires credit assignment to events far back in the past. The most common method for training recurrent neural networks, back-propagation through time (BPTT), requires credit information to be propagated backwards through every single step of the forward computation, potentially over thousands or millions of time steps. This becomes computationally expensive or even infeasible when used with long sequences. Importantly, biological brains are unlikely to perform such detailed reverse replay over very long sequences of internal states (consider days, months, or years). However, humans are often reminded of past memories or mental states which are associated with the current mental state. We consider the hypothesis that such memory associations between past and present could be used for credit assignment through arbitrarily long sequences, propagating the credit assigned to the current state to the associated past state. Based on this principle, we study a novel algorithm which only back-propagates through a few of these temporal skip connections, realized by a learned attention mechanism that associates current states with relevant past states. We demonstrate in experiments that our method matches or outperforms regular BPTT and truncated BPTT in tasks involving particularly long-term dependencies, but without requiring the biologically implausible backward replay through the whole history of states. Additionally, we demonstrate that the proposed method transfers to longer sequences significantly better than LSTMs trained with BPTT and LSTMs trained with full self-attention.
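As a rough illustration of the mechanism described in the abstract, the sketch below (PyTorch) keeps a memory of past hidden states, truncates the ordinary recurrent gradient path, and lets a learned attention select a few past states whose skip connections carry gradient backwards. This is a minimal sketch of the idea, not the authors' released implementation; names such as SparseBacktrackRNN, k_top, and trunc_len are illustrative assumptions.

```python
# Minimal sketch of sparse attentive backtracking, assuming a GRU backbone.
# Gradients flow only through the few attended skip connections (and their
# local histories); the ordinary recurrent path is periodically truncated.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseBacktrackRNN(nn.Module):
    def __init__(self, input_size, hidden_size, k_top=3, trunc_len=5):
        super().__init__()
        self.cell = nn.GRUCell(input_size, hidden_size)
        self.attn = nn.Linear(hidden_size, hidden_size, bias=False)
        self.k_top = k_top          # number of past states attended to (sparse)
        self.trunc_len = trunc_len  # truncation length of the ordinary BPTT path

    def forward(self, x):  # x: (seq_len, batch, input_size)
        h = x.new_zeros(x.size(1), self.cell.hidden_size)
        memory, outputs = [], []
        for t in range(x.size(0)):
            h = self.cell(x[t], h)
            if memory:
                past = torch.stack(memory)                 # (t, batch, hidden)
                scores = (self.attn(h) * past).sum(-1)     # (t, batch)
                k = min(self.k_top, past.size(0))
                top_val, top_idx = scores.topk(k, dim=0)
                weights = F.softmax(top_val, dim=0)        # (k, batch)
                idx = top_idx.unsqueeze(-1).expand(-1, -1, past.size(-1))
                selected = past.gather(0, idx)             # (k, batch, hidden)
                # Sparse skip connections: credit assigned to the current state
                # propagates only to the k attended past states.
                h = h + (weights.unsqueeze(-1) * selected).sum(0)
            memory.append(h)   # stored with its graph so attention can route
                               # credit back to this state later
            outputs.append(h)
            if (t + 1) % self.trunc_len == 0:
                h = h.detach()  # truncate the ordinary recurrent gradient path
        return torch.stack(outputs)
```

For example, `SparseBacktrackRNN(16, 64)(torch.randn(100, 8, 16))` produces outputs of shape (100, 8, 64) whose backward pass reaches past time steps beyond the truncation window only via the attended skip connections, rather than by replaying the whole history.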