Copy task #6

I will gather all the progress on the Copy task in this issue. I will likely update this issue regularly (hopefully), so you may want to unsubscribe from this issue if you don't want to get all the spam.

Comments
Training the NTM on short sequences

I have trained the NTM on short sequences. It does not generalize to longer sequences yet, though. When I test on longer sequences, it sometimes repeats the last input vector multiple times (this is also clearly visible on the read weights).

Parameters of the experiment

Same parameter settings as in #4.
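For reference, here is a minimal sketch of how copy-task data can be generated; the 8-bit vector size, the function name, and the extra delimiter channel are illustrative assumptions, not taken from this repository. Testing generalization then just means calling it with a larger seq_len than the one used during training.

```python
import numpy as np

def copy_task_example(seq_len, vec_size=8, rng=None):
    """Build one copy-task example (sizes are illustrative assumptions).

    Input:  seq_len random binary vectors, then a delimiter flag,
            then seq_len blank steps during which the NTM must
            reproduce the original sequence.
    Target: zeros during the input phase, the sequence during recall.
    """
    rng = rng or np.random.default_rng()
    seq = rng.integers(0, 2, size=(seq_len, vec_size)).astype(np.float32)

    # One extra input channel is reserved for the end-of-sequence delimiter.
    inputs = np.zeros((2 * seq_len + 1, vec_size + 1), dtype=np.float32)
    inputs[:seq_len, :vec_size] = seq
    inputs[seq_len, vec_size] = 1.0  # delimiter flag

    targets = np.zeros((2 * seq_len + 1, vec_size), dtype=np.float32)
    targets[seq_len + 1:] = seq  # recall phase
    return inputs, targets
```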
Training on short sequences & generalization (cheating)

The successful example from #6 (comment) shows that the head writes on random locations in the memory, but is still able to retrieve these locations afterwards. A more natural approach (and what the results from DeepMind show) would be to write on adjacent addresses at every time step, and it would suggest that either:
I decided to run a test where I simply skip the content addressing and gating, to see if it would indeed have the intended behavior (writing on consecutive addresses). It turns out it worked really well (when it converges, see below) and shows amazing generalization capabilities. In this example I still trained the NTM on short sequences.

Parameters of the experiment

Overall the same settings as in #4, but without any content addressing or gating.

Issues
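To make the "no content addressing, no gating" setup concrete, here is a hedged sketch of what remains of the NTM addressing from the paper: the gated weights reduce to the previous weights, which are rotated by a circular convolution with the shift distribution and then sharpened. The function name, memory size, and parameter values below are illustrative assumptions, not this repository's actual code.

```python
import numpy as np

def shift_weights(w_prev, shift, gamma=1.0):
    """Location-only NTM addressing (a sketch, not the repo's actual code).

    With content addressing and gating skipped, the gated weights are
    simply the previous weights. They are rotated by a circular
    convolution with the shift distribution and sharpened by gamma.
    With the shift mass on +1, the head steps to the next adjacent
    address at every time step.
    """
    n = w_prev.shape[0]
    # Circular convolution: w_tilde[i] = sum_j w_prev[j] * shift[(i - j) % n]
    w_tilde = np.array([
        sum(w_prev[j] * shift[(i - j) % n] for j in range(n))
        for i in range(n)
    ])
    # Sharpening, as in the NTM paper
    w = w_tilde ** gamma
    return w / w.sum()

# Example: a head focused on address 0, stepping forward one slot per step
w = np.eye(8)[0]
step_right = np.zeros(8)
step_right[1] = 1.0  # shift distribution concentrated on +1
for _ in range(3):
    w = shift_weights(w, step_right, gamma=10.0)
print(np.argmax(w))  # 3: the head visited consecutive addresses
```

This is consistent with the observed behavior: with no content lookup to jump elsewhere, the only way the head can move is one step at a time, which is exactly the adjacent-address pattern the DeepMind results show.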