This repository contains the code for the paper [Overparameterized ReLU Neural Networks Learn the Simplest Model: Neural Isometry and Phase Transitions](https://arxiv.org/abs/2209.15265).
Suppose that
- ReLU networks:
$$ f^\mathrm{ReLU}(\mathbf{X};\Theta) =(\mathbf{X}\mathbf{W}_1)_+\mathbf{w}_2, \quad \Theta = (\mathbf{W}_1,\mathbf{w}_2), $$
where
- ReLU networks with skip connections:
$$ f^\mathrm{ReLU}(\mathbf{X};\Theta) =\mathbf{X}\mathbf{w}_{1,1} w_{2,1}+\sum_{i=2}^m (\mathbf{X}\mathbf{w}_{1,i})_+ w_{2,i}, $$
where
- ReLU networks with normalization layers:
$$ f^\mathrm{ReLU}(\mathbf{X};\Theta) =\sum_{i=1}^m \operatorname{NM}_{\alpha_i}((\mathbf{X}\mathbf{w}_{1,i})_+)w_{2,i}, $$
where $\Theta = (\mathbf{W}_1,\mathbf{w}_2,\mathbf{\alpha})$ and the normalization operation $\operatorname{NM}_\alpha(\mathbf{v})$ is defined by
We consider the regularized training problem
When
We include code to solve the convex optimization formulations of the minimal norm problem and code to train the neural networks discussed in the paper. We also include code to plot the phase transition graphs shown in the paper.
More details about the numerical experiments can be found in the appendix of the paper.
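As an illustration of the first two architectures defined above, here is a minimal NumPy sketch of their forward passes. It is not taken from the repository: the function names and the toy sizes `n`, `d`, `m` are hypothetical, and `W1` is assumed to store the first-layer weights $\mathbf{w}_{1,i}$ as columns.

```python
import numpy as np

def relu(z):
    """Elementwise positive part, written (z)_+ in the formulas above."""
    return np.maximum(z, 0.0)

def relu_net(X, W1, w2):
    """Plain two-layer ReLU network: (X W1)_+ w2."""
    return relu(X @ W1) @ w2

def relu_skip_net(X, W1, w2):
    """Skip-connection variant: neuron 1 is linear, neurons 2..m use ReLU."""
    linear_part = (X @ W1[:, 0]) * w2[0]
    relu_part = relu(X @ W1[:, 1:]) @ w2[1:]
    return linear_part + relu_part

# Hypothetical sizes, for illustration only.
rng = np.random.default_rng(0)
n, d, m = 6, 3, 4
X = rng.standard_normal((n, d))
W1 = rng.standard_normal((d, m))
w2 = rng.standard_normal(m)

print(relu_net(X, W1, w2).shape)       # (6,)
print(relu_skip_net(X, W1, w2).shape)  # (6,)
```

Setting every second-layer weight except the first to zero reduces `relu_skip_net` to the linear map `X @ W1[:, 0]`, which is the planted linear neuron setting used in the skip-connection experiments.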
Solving the convex programs requires CVXPY (version >= 1.1.13). The MOSEK solver is preferred, but you can switch to another solver as described in the CVXPY documentation.
Training the neural networks discussed in the paper requires PyTorch (version >= 1.10.0).
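A quick way to check these version requirements before running anything is sketched below. It uses only the standard library; the helper names are hypothetical, and the PyPI distribution names `cvxpy` and `torch` are assumed.

```python
import re
from importlib import metadata

# Minimum versions stated in this README.
REQUIREMENTS = {"cvxpy": (1, 1, 13), "torch": (1, 10, 0)}

def parse_version(text):
    """Turn '1.10.0+cu113' into (1, 10, 0); trailing tags are ignored."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return tuple(int(part) if part is not None else 0 for part in match.groups())

def check_requirements(requirements=REQUIREMENTS):
    """Return {package: status} without importing the packages themselves."""
    report = {}
    for name, minimum in requirements.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        ok = parse_version(installed) >= minimum
        report[name] = f"{installed} ({'ok' if ok else 'too old'})"
    return report

if __name__ == "__main__":
    for name, status in check_requirements().items():
        print(f"{name}: {status}")
```

Comparing integer tuples rather than strings matters here, since `"1.9.0" > "1.10.0"` as strings even though 1.9.0 is the older release.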
Compute the recovery rate of the planted linear neuron by solving the minimal norm problem for ReLU networks with skip connections over 5 independent trials.
```shell
python rec_rate_skip.py --n 400 --d 100 --sample 5 --sigma 0 --optw 0 --optx 0
```
Compute the absolute distance by solving the convex programs over 5 independent trials.
```shell
# minimal norm problem
python minnrm_skip.py --n 400 --d 100 --sample 5 --sigma 0 --optw 0 --optx 0
# convex training problem
python cvx_train_skip.py --n 400 --d 100 --sample 5 --sigma 0 --optw 0 --optx 0
```
Compute the test distance by training ReLU networks with skip connections over 10 independent trials.
```shell
python ncvx_train_skip.py --n 400 --d 100 --sample 10 --sigma 0 --optw 0 --optx 0
```
- You can change `--save_details`, `--save_folder`, and `--seed` accordingly.
- For ReLU networks with normalization layers, you can also set the number of planted neurons by changing `--neu`. Details about the supported types of planted neuron and data matrix can be found in the comments of the code.
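If you want to run one of the scripts at several problem sizes, the command lines can be generated programmatically. The sketch below is an assumption-laden helper, not part of the repository: `build_command` and the grid values are hypothetical, and it uses only the flags that appear in the commands above. Whether the scripts already sweep over sizes internally is not stated here, so check the code comments first.

```python
import itertools
import shlex

def build_command(script, n, d, sample=5, sigma=0, optw=0, optx=0):
    """Assemble one invocation using only the flags shown in the commands above."""
    return [
        "python", script,
        "--n", str(n), "--d", str(d),
        "--sample", str(sample), "--sigma", str(sigma),
        "--optw", str(optw), "--optx", str(optx),
    ]

# Hypothetical grid; choose values to match the experiment you want to run.
n_values = [100, 200, 400]
d_values = [50, 100]

commands = [
    build_command("rec_rate_skip.py", n, d)
    for n, d in itertools.product(n_values, d_values)
]

for cmd in commands:
    # Print only; to execute, pass cmd to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Keeping each command as a list of arguments avoids shell quoting issues if the list is later handed to `subprocess.run`.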
Yixuan Hua (yh7422@princeton.edu)
Yifei Wang (wangyf18@stanford.edu)
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
If you find this repository helpful, please consider citing: