Rationale for training only on 10% of the buffer? #7

Open
sabeaussan opened this issue Feb 25, 2025 · 0 comments

Comments

@sabeaussan

When defining the batch size in training_go.py, you comment: 'To avoid overfitting, we want to make sure the agent only sees ~10% of samples in the replay over one checkpoint. That is, batch_size * ckpt_interval <= replay_capacity * 0.1'. Can you expand on this choice? Intuitively, training on only a small sample of the buffer would foster overfitting rather than prevent it, wouldn't it? Could you explain this choice in more detail, please? :)
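
For reference, here is how I read the constraint from the comment, written as a quick sanity check. The numbers are hypothetical placeholders, not the values actually used in training_go.py:

```python
# Minimal sketch of the constraint quoted above; replay_capacity and
# ckpt_interval are assumed values, not taken from training_go.py.
replay_capacity = 500_000  # max transitions held in the replay buffer (assumed)
ckpt_interval = 10_000     # gradient steps between checkpoints (assumed)

# "agent only sees ~10% of samples in the replay over one checkpoint"
# => batch_size * ckpt_interval <= replay_capacity * 0.1
max_batch_size = int(replay_capacity * 0.1 / ckpt_interval)
batch_size = 4  # chosen so the constraint below holds

assert batch_size * ckpt_interval <= replay_capacity * 0.1, (
    "over one checkpoint interval the agent would sample more than "
    "~10% of the replay buffer"
)
print(f"largest batch size satisfying the constraint: {max_batch_size}")
```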
