Memory usage increases constantly during training. My computer has 16 GB of memory, but I can't complete a training run with step=10**5. Is there any way to stop memory from growing during training?
The above shows how memory usage changes during my run. Once memory usage reaches 70%, the agent starts showing "missed 7 ~ 9 observations(s)".
When I monitor memory with "memory-profiler", it shows that the agent.act_and_train() call in marlo\experiments\train_agent.py keeps increasing memory usage as the loop progresses and never releases it.
I don't know whether anyone has tried to solve this problem. Thanks!
I've tried to probe every variable in this function, but I was quickly overwhelmed, and it was difficult to tell exactly which variable was consuming the most memory.
So I am now trying to train in batches: train for 10000 steps at a time, save the model, reload it, and restart training. From what I have observed, the reloaded model (or agent) performs poorly at first, but it soon recovers to its previous level and continues to learn.
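The batch workaround described above can be sketched roughly as follows. `ToyAgent` is a hypothetical stand-in with `train`, `save`, and `load` methods; it is not the real agent class, and the real save/load calls may differ. The sketch only shows the control flow: build a fresh agent per chunk, restore the checkpoint, train, save, and discard.

```python
import os
import pickle
import tempfile


class ToyAgent:
    """Hypothetical stand-in for the real agent; only train/save/load matter."""

    def __init__(self):
        self.steps_trained = 0

    def train(self, n_steps):
        self.steps_trained += n_steps

    def save(self, dirname):
        os.makedirs(dirname, exist_ok=True)
        with open(os.path.join(dirname, "agent.pkl"), "wb") as f:
            pickle.dump(self.steps_trained, f)

    def load(self, dirname):
        with open(os.path.join(dirname, "agent.pkl"), "rb") as f:
            self.steps_trained = pickle.load(f)


def train_in_chunks(make_agent, total_steps, chunk_steps, checkpoint_dir):
    """Train in chunks, rebuilding the agent from its checkpoint each time."""
    done = 0
    while done < total_steps:
        agent = make_agent()  # fresh object so the previous chunk's state can be freed
        if done > 0:
            agent.load(checkpoint_dir)
        n = min(chunk_steps, total_steps - done)
        agent.train(n)
        agent.save(checkpoint_dir)
        done += n
        del agent
    final = make_agent()
    final.load(checkpoint_dir)
    return final


with tempfile.TemporaryDirectory() as d:
    agent = train_in_chunks(ToyAgent, 10**5, 10**4, d)
    print(agent.steps_trained)
```

If rebuilding the agent inside one process does not release the memory, running each chunk as a separate process is the stricter variant of the same idea. The slow start after reloading may come from state that the checkpoint does not include, though that is only a guess.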