meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting
Keywords: Overfitting, Backpropagation, Speedup
DOI: 10.48550/arxiv.1706.06197
Publication Date: 2017-01-01
AUTHORS (4): Xu Sun, Xuancheng Ren, Shuming Ma, Houfeng Wang
ABSTRACT
We propose a simple yet effective technique for neural network learning. The forward propagation is computed as usual. In back propagation, only a small subset of the full gradient is computed to update the model parameters. The gradient vectors are sparsified in such a way that only the top-$k$ elements (in terms of magnitude) are kept. As a result, only $k$ rows or columns (depending on the layout) of the weight matrix are modified, leading to a linear reduction ($k$ divided by the vector dimension) in the computational cost. Surprisingly, experimental results demonstrate that we can update only 1-4% of the weights at each back propagation pass. This does not result in a larger number of training iterations. More interestingly, the accuracy of the resulting models is actually improved rather than degraded, and a detailed analysis is given. The code is available at https://github.com/lancopku/meProp