
about training rl model #23

Open
sjchasel opened this issue Aug 14, 2022 · 7 comments
@sjchasel

I have trained the catSeq model and its performance matches what you reported. When I use

```
python3 train.py -data data/kp20k/kp20k_separated/rl/ -vocab data/kp20k/kp20k_separated/rl/ -exp_path=exp -exp catSeq_rl_kp20k -epochs 20 -model_path=model/catSeq_rl_9527 -copy_attention -train_rl -one2many -one2many_mode 1 -batch_size 32 -separate_present_absent -pretrained_model=model/catSeq_9527/catSeq_kp20k.ml.one2many.cat.copy.bi-directional.epoch=3.batch=38098.total_batch=120000.model -max_length 60 -baseline self -reward_type 7 -replace_unk -topk G -seed=9527
```

to train an RL model, the loss looks wrong from the beginning: it is always around -0.000x. What do you think might be the problem?

@sjchasel
Author

I found that this is because the q value is very small, almost zero. Is that normal?

@kenchan0226
Owner

I remember that the loss was very small from the beginning.

@sjchasel
Author

> I remembered that the loss is very small from the beginning.

I turned off early stopping, otherwise it would stop after four checkpoints. I have now trained for 4 epochs, but the loss is still -0.0001. Should I wait for it to train for the full 20 epochs?
I don't know whether this is normal for reinforcement learning models. How many epochs did you train your reinforcement learning model for? Or, if you have any trained RL models, could you share them? Thank you.

@kenchan0226
Owner

Sorry, I do not have any pre-trained models since this was more than three years ago. I remember that the best checkpoint is usually at the 3rd or 4th epoch, so it is reasonable for the training script to stop at the 4th epoch. I think a small loss is normal in my RL training code.
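For context on why a near-zero loss can be normal here: with `-baseline self`, a self-critical policy-gradient loss scales with the gap between the sampled sequence's reward and the greedy-decoded baseline's reward. Once the pretrained model's greedy output is already close in quality to its samples, that advantage is tiny and so is the loss. A minimal sketch of this idea (my own simplification in plain Python, not the actual code in this repo):

```python
def self_critical_loss(sample_log_probs, sample_reward, baseline_reward):
    """Self-critical policy-gradient loss:
    loss = -(r_sample - r_baseline) * sum of token log-probs.
    A small advantage (r_sample ~ r_baseline) yields a near-zero loss
    even when the log-probs themselves are not small.
    """
    advantage = sample_reward - baseline_reward
    return -advantage * sum(sample_log_probs)

# Hypothetical numbers: the sampled sequence scores barely above the
# greedy baseline, so the loss magnitude is tiny.
log_probs = [-1.2, -0.8, -2.1]
loss = self_critical_loss(log_probs, sample_reward=0.301, baseline_reward=0.300)
print(f"{loss:.4f}")
```

The takeaway is that the loss magnitude alone is not a good progress signal in self-critical training; the validation reward of the generated keyphrases is more informative.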

@Struggle-lsl

Can I ask some questions?

@Struggle-lsl

Why is the prediction all

