This package contains the accompanying code for the following paper:
Tapani Raiko, Li Yao, KyungHyun Cho, Yoshua Bengio
Iterative Neural Autoregressive Distribution Estimator (NADE-k).
Advances in Neural Information Processing Systems 2014 (NIPS14).
Download Theano and make sure it's working properly.
All the information you need can be found by following this link:
http://deeplearning.net/software/theano/
Make sure Theano is added to your PYTHONPATH.
Install Jobman. Very detailed information can be found here:
http://deeplearning.net/software/jobman/install.html
Make sure jobman is added to your PYTHONPATH.
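
One quick way to confirm that both packages are visible to Python is sketched below. It assumes a Python 3 interpreter and only tests whether the module names `theano` and `jobman` can be located; it does not verify that either package works. The helper name `missing_modules` is ours and is not part of Theano or Jobman.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(["theano", "jobman"])
    if missing:
        print("Not found on PYTHONPATH:", ", ".join(missing))
    else:
        print("theano and jobman can both be located.")
```

If either name is reported as missing, revisit the PYTHONPATH setting before going further.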
You can download the dataset from the links below.
[trainset](http://www.cs.toronto.edu/~larocheh/public/datasets/binarized_mnist/binarized_mnist_train.amat)
validset
testset
After the dataset has been downloaded, make sure to change the `data_path` in `utils.py`.
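
Before pointing `data_path` at the files, it can help to check that a download is complete. The sketch below assumes each row of an `.amat` file is a whitespace-separated list of 784 values (a 28x28 image), each 0 or 1; that layout is our assumption about the binarized MNIST files, not something this package documents. The function name `check_amat` is hypothetical.

```python
def check_amat(path, n_pixels=784, max_rows=100):
    """Validate the first rows of an .amat file; return how many rows were checked."""
    checked = 0
    with open(path) as handle:
        for line in handle:
            if checked >= max_rows:
                break
            values = line.split()
            if not values:
                continue  # skip blank lines
            if len(values) != n_pixels:
                raise ValueError("row %d has %d values, expected %d"
                                 % (checked, len(values), n_pixels))
            # Assumption: pixels are stored as 0/1 (possibly written as 0.0/1.0).
            if any(float(v) not in (0.0, 1.0) for v in values):
                raise ValueError("row %d contains a non-binary value" % checked)
            checked += 1
    return checked
```

Calling `check_amat("binarized_mnist_train.amat")` on the downloaded file should return 100 if the first 100 rows look as assumed; a `ValueError` suggests a truncated download or a different file layout.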
- Change `exp_path` in `config.py`. This is the directory where all the training outputs are going to be placed. For different experiments, one needs to specify `'save_model_path'` in the same config file.
- To run NADE-5 1HL in Table 1 of the paper, make sure `'n_layers': 1,` and `'l2': 0.0` are set.
- To run NADE-5 2HL in Table 1 of the paper, make sure `'n_layers': 2,` and `'l2': 0.0012279827881` are set.
- To start training, run `python train_model.py`
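
The two Table 1 setups differ only in `'n_layers'` and `'l2'`. As a rough illustration, the sketch below keeps both sets of values in one place and applies them to a config dictionary. It assumes the options live in a flat Python dict, which may not match how `config.py` is actually organized; `TABLE1_SETTINGS` and `with_table1_settings` are hypothetical names, and editing `config.py` by hand works just as well.

```python
# The 'n_layers' and 'l2' values are the ones given in the instructions above.
TABLE1_SETTINGS = {
    "NADE-5 1HL": {"n_layers": 1, "l2": 0.0},
    "NADE-5 2HL": {"n_layers": 2, "l2": 0.0012279827881},
}


def with_table1_settings(config, model_name):
    """Return a copy of a flat config dict with the chosen settings applied."""
    updated = dict(config)  # leave the caller's dict untouched
    updated.update(TABLE1_SETTINGS[model_name])
    return updated
```

For example, `with_table1_settings(my_config, "NADE-5 2HL")` returns a new dict with two hidden layers and the nonzero L2 coefficient, leaving every other option as it was.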
It is highly recommended to run the code on GPUs. For instructions on how to set this up, see: http://deeplearning.net/software/theano/tutorial/using_gpu.html
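
Theano reads its device selection from the `THEANO_FLAGS` environment variable. The sketch below builds an environment that requests a GPU with `float32` precision. The device name is an assumption that depends on your Theano version (`gpu` for the older backend, `cuda` for the newer one), and `gpu_env` is a hypothetical helper, not part of this package.

```python
import os


def gpu_env(device="gpu", base_env=None):
    """Return a copy of the environment with THEANO_FLAGS requesting a GPU."""
    env = dict(os.environ if base_env is None else base_env)
    # Note: this replaces any THEANO_FLAGS value that was already set.
    env["THEANO_FLAGS"] = "device=%s,floatX=float32" % device
    return env
```

Training could then be launched with `subprocess.call([sys.executable, "train_model.py"], env=gpu_env())`. Setting `THEANO_FLAGS` directly in the shell before running `python train_model.py` has the same effect.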
During training, a lot of information is printed to the screen, and many files are written to `save_model_path`. Every `valid_freq` epochs, you will be able to see a plot of the decreasing training cost, samples generated from the model, and the log-likelihood on the validset and testset.
If you use the default setup, the model will be pretrained for 1000 epochs and finetuned for another 3000 epochs. To have a good generative model, one needs to be patient :)
In addition, we have provided some training logs against which you should be able to match your experiments. See the directory `results`.
After training is done, it is time to get all those SOTA numbers in Table 1 of the paper.
- In `config.py`, change the option `'action'` to 1. Meanwhile, make sure `'from_path'` points to the directory that contains `model_params_e*.pkl` and `model_configs.pkl`. The option `'epoch'` specifies which of the saved models there you would like to use.
- Then run `python train_model.py`
- If all goes well, the evaluation script should be able to produce numbers that match those in the paper.
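
To decide what to put in `'epoch'`, it helps to see which checkpoints exist in `'from_path'`. The sketch below lists them, assuming the `*` in `model_params_e*.pkl` is an integer epoch number; that reading of the file names is our assumption, and `available_epochs` is a hypothetical helper rather than part of the package.

```python
import glob
import os
import re


def available_epochs(from_path):
    """Return the sorted epoch numbers of model_params_e<N>.pkl files in from_path."""
    pattern = re.compile(r"model_params_e(\d+)\.pkl$")
    epochs = []
    for path in glob.glob(os.path.join(from_path, "model_params_e*.pkl")):
        match = pattern.search(os.path.basename(path))
        if match:  # ignore files whose suffix is not a plain integer
            epochs.append(int(match.group(1)))
    return sorted(epochs)
```

The last element of the returned list is the most recent checkpoint under this naming assumption; an empty list means no matching files were found in that directory.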
IMPORTANT: You will probably be surprised when you see better numbers than those reported in our paper. Calm down; we know this can happen. The longer you train our model, the more likely you are to get better numbers. And do share your joy with us when this happens.
NADE-5 1HL model:
testset LL over 10 orderings = -89.43
testset LL over 128 ensembles = -85.77
Those numbers are better than what we used in the paper because the model is trained much longer here.
NADE-5 2HL model:
testset LL over 10 orderings = -87.13
testset LL over 128 ensembles = -84.65
Questions?
Need a trained model?
Contact us: li.yao@umontreal.ca