The Anisotropic Noise in Stochastic Gradient Descent: Its Behavior of Escaping from Sharp Minima and Regularization Effects
An implementation for the paper:
Zhanxing Zhu*, Jingfeng Wu*, Bing Yu, Lei Wu, Jinwen Ma
See folder 2Dim.
One hidden layer experiments
See folder OneHiddenLayer.
See folder FashionMNIST.
See folder SVHN and CIFAR-10.