Understanding the Loss Surface of Neural Networks for Binary Classification

Keywords: Maxima and minima; Hinge loss; Counterexample; Binary classification
DOI: 10.48550/arxiv.1803.00909
Publication Date: 2018-01-01
ABSTRACT
It is widely conjectured that training algorithms for neural networks succeed because all local minima lead to similar performance; for example, see (LeCun et al., 2015; Choromanska et al., 2015; Dauphin et al., 2014). Performance is typically measured in terms of two metrics: training performance and generalization performance. Here we focus on the training performance of single-layered neural networks for binary classification, and provide conditions under which the training error is zero at all local minima of a smooth hinge loss function. Our conditions are roughly of the following form: the neurons have to be increasing and strictly convex, and the surrogate loss function should be a smooth version of the hinge loss. We also provide counterexamples to show that, when the hinge loss is replaced with the quadratic loss or the logistic loss, the result may not hold.
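The losses contrasted in the abstract can be made concrete. The sketch below is illustrative only and is not the paper's code: it assumes labels y in {-1, +1}, writes each loss as a function of the margin m = y * f(x), and uses a squared hinge as one possible smooth version of the hinge loss (the paper's exact smoothing is not reproduced here).

    import numpy as np

    # Margin m = y * f(x), with labels y in {-1, +1} and network output f(x).

    def smooth_hinge(m):
        # Squared hinge: one common smooth surrogate for max(0, 1 - m).
        # This particular smoothing is an illustrative assumption.
        return np.maximum(0.0, 1.0 - m) ** 2

    def quadratic(m):
        # Quadratic (squared error) loss on the margin.
        return (1.0 - m) ** 2

    def logistic(m):
        # Logistic loss log(1 + exp(-m)).
        return np.log1p(np.exp(-m))

    margins = np.array([-1.0, 0.0, 1.0, 2.0])
    for name, loss in [("smooth hinge", smooth_hinge),
                       ("quadratic", quadratic),
                       ("logistic", logistic)]:
        print(name, loss(margins))

Evaluating at the sample margins shows the qualitative difference: only the hinge-type loss is exactly zero once a point is classified with margin m >= 1, whereas the quadratic loss penalizes margins larger than 1 and the logistic loss is positive everywhere. This gives some intuition, though not a proof, for why the zero-training-error result can fail for those surrogates.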