Improving the Neural GPU Architecture for Algorithm Learning

DOI: 10.48550/arxiv.1702.08727 Publication Date: 2017-01-01
ABSTRACT
Algorithm learning is a core problem in artificial intelligence with significant implications for the level of automation that machines can achieve. Recently, deep learning methods have emerged for synthesizing an algorithm from its input-output examples, the most successful being the Neural GPU, which is capable of learning multiplication. We present several improvements to the Neural GPU that substantially reduce training time and improve generalization. We introduce a new technique - hard nonlinearities with saturation costs - that has general applicability. We also introduce a technique of diagonal gates that can be applied to active-memory models. The proposed architecture is the first capable of learning decimal multiplication end-to-end.
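The "hard nonlinearities with saturation costs" idea can be illustrated as follows: replace a smooth activation such as tanh with a piecewise-linear hard version, and add an auxiliary loss term that penalizes pre-activations driven deep into the saturated region. The sketch below is a minimal NumPy illustration of this general idea, not the paper's exact implementation; the `margin` hyperparameter is an assumption for illustration.

```python
import numpy as np

def hard_tanh(x):
    # Hard nonlinearity: clip to [-1, 1] instead of using smooth tanh.
    # Gradients are exactly 1 inside the linear region and 0 outside.
    return np.clip(x, -1.0, 1.0)

def saturation_cost(x, margin=1.0):
    # Auxiliary penalty on pre-activations pushed past the saturation
    # boundary; added (with a small weight) to the training loss so the
    # network is discouraged from operating in the zero-gradient region.
    # `margin` is an illustrative hyperparameter, not taken from the paper.
    return np.sum(np.maximum(np.abs(x) - margin, 0.0))

# Example: pre-activations from some layer.
pre = np.array([-2.5, -0.3, 0.0, 0.8, 1.7])
act = hard_tanh(pre)        # activations, clipped to [-1, 1]
cost = saturation_cost(pre) # penalty contributed by the out-of-range values
```

A total loss of the form `task_loss + alpha * saturation_cost(pre)` (with a small weight `alpha`) keeps units responsive while retaining the hard nonlinearity's sharp decision behavior.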