Advancing Neural Network Performance through Emergence-Promoting Initialization Scheme
DOI:
10.48550/arxiv.2407.19044
Publication Date:
2024-07-26
AUTHORS (3)
ABSTRACT
We introduce a novel yet straightforward neural network initialization scheme that modifies conventional methods like Xavier and Kaiming initialization. Inspired by the concept of emergence and leveraging the emergence measures proposed by Li (2023), our method adjusts layer-wise weight scaling factors to achieve higher emergence values. This enhancement is easy to implement, requiring no additional optimization steps compared to GradInit. We evaluate our approach across various architectures, including MLP and convolutional architectures for image recognition and transformers for machine translation, and demonstrate substantial improvements in both model accuracy and training speed, with and without batch normalization. Its simplicity, theoretical innovation, and demonstrable empirical advantages make it a potent enhancement to standard initialization practices. These results suggest a promising direction for improving initialization methodologies. Code is available at: https://github.com/johnnyjingzeli/EmergenceInit.
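The abstract does not spell out the exact scaling rule, so the following is only a minimal sketch of how a layer-wise rescaled initialization of this kind might look in PyTorch. It assumes the scheme applies a standard Kaiming initialization and then multiplies the weights by a per-layer gain factor greater than one; the helper name emergence_scaled_init and the gain value 1.5 are illustrative assumptions, not the authors' released implementation (see the linked repository for that).

import torch
import torch.nn as nn

def emergence_scaled_init(module, gain=1.5):
    # Hypothetical helper: standard Kaiming (He) initialization followed by
    # an extra layer-wise multiplicative gain, sketching the idea of raising
    # the layer-wise weight scaling factors. The rule used in the paper may differ.
    if isinstance(module, (nn.Linear, nn.Conv2d)):
        nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
        with torch.no_grad():
            module.weight.mul_(gain)  # scale the weights by the assumed gain
        if module.bias is not None:
            nn.init.zeros_(module.bias)

# Example usage: apply the scheme to every layer of a small MLP.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
model.apply(lambda m: emergence_scaled_init(m, gain=1.5))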