About training the RL model #23
Comments
I find that this is because the Q value is very small, almost zero. Is that normal?
I remember that the loss was very small from the beginning.
I turned off early stopping, otherwise it would have stopped after four checkpoints. Now I have trained for 4 epochs, but the loss is still -0.0001. Should I wait for it to train for 20 epochs?
Sorry, I do not have any pre-trained models since this was more than three years ago. I remember that the best checkpoint is usually at the 3rd or 4th epoch, so it is reasonable that the training script stops at the 4th epoch. I think it is normal to have a small loss in my RL training code.
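For anyone hitting the same confusion, here is a minimal sketch of why a self-critical RL loss can legitimately sit near zero. This is not the repository's actual code; the function, tensor names, and numbers below are illustrative assumptions only.

```python
# Minimal sketch (hypothetical, not this repo's implementation) of a
# self-critical policy-gradient loss.
import torch

def self_critical_loss(sample_log_probs, sample_reward, greedy_reward):
    # sample_log_probs: (batch,) summed log-probabilities of the sampled sequences
    # sample_reward / greedy_reward: (batch,) sequence-level rewards (e.g. an F1 score)
    # Advantage = sampled reward minus the greedy-decoding (self-critical) baseline.
    # When sampling and greedy decoding score almost identically, the advantage is
    # close to zero and so is the loss, even though gradients still flow whenever
    # the two decodes differ.
    advantage = (sample_reward - greedy_reward).detach()
    return -(advantage * sample_log_probs).mean()

# Example: sampled rewards only slightly below the greedy baseline (made-up numbers)
log_probs = torch.tensor([-12.3, -9.8, -15.1])   # typical summed log-probs
sample_r = torch.tensor([0.310, 0.280, 0.402])
greedy_r = torch.tensor([0.312, 0.281, 0.405])
print(self_critical_loss(log_probs, sample_r, greedy_r))  # a small negative number
```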
Can I ask some questions?
Why is the prediction all
No description provided. |
I have trained the catSeq model and its performance matches what you reported. When I use
python3 train.py -data data/kp20k/kp20k_separated/rl/ -vocab data/kp20k/kp20k_separated/rl/ -exp_path=exp -exp catSeq_rl_kp20k -epochs 20 -model_path=model/catSeq_rl_9527 -copy_attention -train_rl -one2many -one2many_mode 1 -batch_size 32 -separate_present_absent -pretrained_model=model/catSeq_9527/catSeq_kp20k.ml.one2many.cat.copy.bi-directional.epoch=3.batch=38098.total_batch=120000.model -max_length 60 -baseline self -reward_type 7 -replace_unk -topk G -seed=9527
to train an RL model, the loss looks wrong from the beginning: it is always around -0.000x. What do you think might be the problem?
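A hypothetical way to check this (not code that ships with the repo) is to log the per-batch sampled and greedy rewards during RL training; if the two are nearly equal, a loss around -0.000x is exactly what a self-critical baseline produces.

```python
# Hypothetical diagnostic snippet: print sampled vs. greedy rewards to see
# whether the tiny loss is just a tiny advantage rather than a bug.
import torch

def log_reward_gap(sample_reward: torch.Tensor, greedy_reward: torch.Tensor, step: int):
    # Both tensors hold (batch,) sequence-level rewards for the current batch.
    gap = sample_reward - greedy_reward
    print(f"step {step}: sample={sample_reward.mean().item():.4f} "
          f"greedy={greedy_reward.mean().item():.4f} "
          f"adv_mean={gap.mean().item():.6f} adv_std={gap.std().item():.6f}")
    # If both rewards are themselves near zero, the model is not yet producing
    # keyphrases that the reward function credits; if both are reasonable but
    # nearly equal, a near-zero loss is the expected behaviour of a
    # self-critical baseline.
```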